List scheduled queries in BigQuery

Old*_*Ted 3 google-apps-script google-bigquery

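I need to (programmatically) analyze the details of scheduled queries in BigQuery (for example, which tables are updated and which tables are accessed in the SQL). I have done something similar for BQ tables/views with Apps Script, using BigQuery.Tables.list(), but I cannot find an API for accessing scheduled queries.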

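The UI can list them, so I assume it should be possible programmatically, for example through a REST API. Does anyone know whether this is possible, which interfaces are supported (Apps Script, REST ...), and perhaps have an example of how to use it?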

Gui*_*ins 5

Scheduled queries are part of BigQuery's Data Transfer Service, so you have to use its API, in particular the projects.transferConfigs.list method. Set the dataSourceIds field to scheduled_query and parent to projects/PROJECT_ID. As discussed in the comments, if you are using a regional location such as europe-west2 instead of a multi-regional one (EU or US), use projects.locations.transferConfigs.list instead; the parent resource then takes the form projects/PROJECT_ID/locations/REGIONAL_LOCATION.

In addition, for other transfer types you can get the corresponding dataSourceIds with the projects.dataSources.list method. That's how I found the scheduled_query one.
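
To make the choice of parent concrete, here is a minimal Python sketch that only builds the request URL for the two cases described above. The function names and the MULTI_REGIONS set are my own for illustration, and the base URL is what I understand the Data Transfer Service v1 REST endpoint to be; the sketch does not authenticate or send anything.

```python
from urllib.parse import urlencode

# Assumed base URL of the BigQuery Data Transfer Service v1 REST API.
BASE_URL = "https://bigquerydatatransfer.googleapis.com/v1"

# Multi-regional locations named in the answer; anything else is treated as regional.
MULTI_REGIONS = {"us", "eu"}


def transfer_configs_parent(project_id, location=None):
    """Return the `parent` resource for a transferConfigs.list call."""
    if location is None or location.lower() in MULTI_REGIONS:
        # projects.transferConfigs.list
        return f"projects/{project_id}"
    # projects.locations.transferConfigs.list
    return f"projects/{project_id}/locations/{location}"


def list_url(project_id, location=None):
    """Build the list URL, filtered to scheduled queries only."""
    parent = transfer_configs_parent(project_id, location)
    query = urlencode({"dataSourceIds": "scheduled_query"})
    return f"{BASE_URL}/{parent}/transferConfigs?{query}"


if __name__ == "__main__":
    print(list_url("my-project"))
    print(list_url("my-project", "europe-west2"))
```

The request itself still needs an OAuth access token in an Authorization header, which is left out here.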

The response contains an array of scheduled queries (under transferConfigs), each of which looks like this:

{
  "name": "projects/<PROJECT_NUMBER>/locations/us/transferConfigs/<TRANSFER_CONFIG_ID>",
  "destinationDatasetId": "<DATASET>",
  "displayName": "hacker-news",
  "updateTime": "2018-11-14T15:39:18.897911Z",
  "dataSourceId": "scheduled_query",
  "schedule": "every 24 hours",
  "nextRunTime": "2019-04-19T15:39:00Z",
  "params": {
    "write_disposition": "WRITE_APPEND",
    "query": "SELECT @run_time AS time,\n  title,\n  author,\n  text\nFROM `bigquery-public-data.hacker_news.stories`\nLIMIT\n  1000",
    "destination_table_name_template": "hacker_daily_news"
  },
  "state": "SUCCEEDED",
  "userId": "<USER_ID>",
  "datasetRegion": "us"
}
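
Since the question is about which tables get updated and which are read, here is a rough Python sketch of how such a response could be summarized. It assumes the list response wraps the configs in a transferConfigs array and uses the fields shown above. The function names are hypothetical, and the regular expression is only a heuristic: it catches backtick-quoted names after FROM or JOIN, so it will miss unquoted references and tables written by DML statements.

```python
import json
import re

# Rough heuristic: a backtick-quoted name that follows FROM or JOIN.
TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+`([^`]+)`", re.IGNORECASE)


def summarize_config(config):
    """Pull the destination table and referenced source tables from one config."""
    params = config.get("params", {})
    query = params.get("query", "")
    dataset = config.get("destinationDatasetId")
    table = params.get("destination_table_name_template")
    return {
        "display_name": config.get("displayName"),
        # Queries without a destination table (for example DML) yield None here.
        "destination": f"{dataset}.{table}" if dataset and table else None,
        "sources": sorted(set(TABLE_REF.findall(query))),
    }


def summarize_response(response_text):
    """Summarize every scheduled query in a transferConfigs.list response body."""
    body = json.loads(response_text)
    return [summarize_config(c) for c in body.get("transferConfigs", [])]
```

For anything beyond a quick inventory, a proper SQL parser would be more reliable than the regular expression.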

With bash and curl, the API call is a GET request to https://bigquerydatatransfer.googleapis.com/v1/projects/PROJECT_ID/locations/REGIONAL_LOCATION/transferConfigs?dataSourceIds=scheduled_query with an Authorization: Bearer header carrying an access token, for example one obtained from gcloud auth print-access-token.


Mar*_*zio 5

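The answer above is a great one on using the REST API. For completeness, I would like to include the CLI approach to the same problem. Personally, I find it better suited to shell scripts, but YMMV.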

Example: list the scheduled queries for the default project.

bq ls --transfer_config --transfer_location=US --format=prettyjson

Example: show the details of a scheduled query for the default project.

bq show --format=prettyjson --transfer_config [RESOURCE_NAME]
# RESOURCE_NAME is a value you can get from the above bq ls command.

More details can be found here.
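
If you would rather drive the two bq commands from a script than from the shell, something along these lines might work. It is a sketch under assumptions: the helper names are hypothetical, and it assumes that bq ls with --format=prettyjson prints a JSON array whose elements carry a name field usable as RESOURCE_NAME. The run parameter exists only so the command runner can be swapped out.

```python
import json
import subprocess


def bq_ls_args(location="US"):
    """Arguments for listing transfer configs, as in the bq ls example."""
    return ["bq", "ls", "--transfer_config",
            f"--transfer_location={location}", "--format=prettyjson"]


def bq_show_args(resource_name):
    """Arguments for showing one transfer config, as in the bq show example."""
    return ["bq", "show", "--format=prettyjson", "--transfer_config", resource_name]


def run_json(args, run=subprocess.run):
    """Run a command and parse its standard output as JSON."""
    completed = run(args, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)


def scheduled_query_details(location="US", run=subprocess.run):
    """List the configs, then fetch the details of each one by its name."""
    # Assumption: each listed element has a "name" field holding the resource name.
    configs = run_json(bq_ls_args(location), run)
    return [run_json(bq_show_args(c["name"]), run) for c in configs]
```

Calling scheduled_query_details("US") would then return one parsed JSON object per scheduled query, provided bq is installed and authenticated for the default project.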