ASH*_*ASH 2 python sql r python-3.x google-bigquery
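I'm trying to figure out how to list the sizes of all tables across all projects in Google BigQuery. Maybe it's a SQL union over multiple tables. However, I'm looking at a lot of tables here, so I'd like some kind of automated solution. I could use R code for this task, or even Python. If anyone here has a solution that lists some metrics, mainly the size of each object (table) plus other relevant metrics, please share it here. Thanks a lot!
This Python example lists all tables in all projects along with their sizes in bytes. You can use it as a starting point to build a script that fits your use case: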
from google.cloud import bigquery
from googleapiclient import discovery
from oauth2client.client import GoogleCredentials

# Credentials used to list the projects
credentials = GoogleCredentials.get_application_default()
service = discovery.build('cloudresourcemanager', 'v1', credentials=credentials)

# List the projects
request = service.projects().list()
response = request.execute()

# Main loop over the projects
for project in response.get('projects', []):
    client = bigquery.Client(project['projectId'])  # Start the client in the right project

    # List the datasets
    datasets = list(client.list_datasets())

    if datasets:  # If the project has any BigQuery datasets
        print('Datasets in project {}:'.format(project['name']))

        # Second loop: list the tables in each dataset
        for dataset in datasets:
            print(' - {}'.format(dataset.dataset_id))
            # This query retrieves every table in the dataset with its size in bytes.
            # It can be modified to return more fields.
            get_size = client.query("select table_id, size_bytes as size from "+dataset.dataset_id+".__TABLES__")
            tables = get_size.result()  # Get the result

            # Third loop: print each table and its size
            for table in tables:
                print('\t{} size: {}'.format(table.table_id, table.size))
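Since the question also asks for other metrics besides size, here is a minimal sketch of two helpers that could be plugged into the loop above. The function names `build_tables_query` and `human_size` are hypothetical, and the extra columns (`row_count`, `last_modified_time`) are what `__TABLES__` exposes as far as I know, so check them against your own datasets before relying on them. The project ID is included and the whole path is wrapped in backticks, because project IDs containing hyphens otherwise break the query.

```python
def build_tables_query(project_id: str, dataset_id: str) -> str:
    """Build a query over a dataset's __TABLES__ metadata view.

    Assumption: __TABLES__ exposes row_count and last_modified_time
    in addition to table_id and size_bytes.
    """
    return (
        "SELECT table_id, row_count, size_bytes, last_modified_time "
        f"FROM `{project_id}.{dataset_id}.__TABLES__` "
        "ORDER BY size_bytes DESC"
    )


def human_size(num_bytes: int) -> str:
    """Format a byte count with binary units (B, KiB, MiB, GiB, TiB)."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        # Stop at the first unit that fits, or at the largest unit available.
        if size < 1024 or unit == "TiB":
            return f"{size:.1f} {unit}"
        size /= 1024
```

In the loop, this would probably look like `client.query(build_tables_query(project['projectId'], dataset.dataset_id))`, with `human_size(table.size_bytes)` used when printing each row.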
References:
Listing projects: https://cloud.google.com/resource-manager/reference/rest/v1/projects/list#embedded-explorer
Listing datasets: https://cloud.google.com/bigquery/docs/datasets#bigquery-list-datasets-python