Ish*_*ati 5 python google-bigquery
I am querying data from BigQuery in a loop, using the get_data_from_bq method shown below:
def get_data_from_bq(product_ids):
    format_strings = ','.join([("\"" + str(_id) + "\"") for _id in product_ids])
    query = "select productId, eventType, count(*) as count from [xyz:xyz.abc] where productId in (" + format_strings + ") and eventTime > CAST(\"" + time_thresh +"\" as DATETIME) group by eventType, productId order by productId;"
    query_job = bigquery_client.query(query, job_config=job_config)
    return query_job.result()
The first query (iteration) returns the correct data, but every subsequent query throws the exception below:
    results = query_job.result()
  File "/home/ishank/.local/lib/python2.7/site-packages/google/cloud/bigquery/job.py", line 2415, in result
    super(QueryJob, self).result(timeout=timeout)
  File "/home/ishank/.local/lib/python2.7/site-packages/google/cloud/bigquery/job.py", line 660, in result
    return super(_AsyncJob, self).result(timeout=timeout)
  File "/home/ishank/.local/lib/python2.7/site-packages/google/api_core/future/polling.py", line 120, in result
    raise self._exception
google.api_core.exceptions.BadRequest: 400 Cannot explicitly modify anonymous table xyz:_bf4dfedaed165b3ee62d8a9efa.anon1db6c519_b4ff_dbc67c17659f
Edit 1: Below is a sample query that throws the above exception. The same query runs without problems in the BigQuery console.
select productId, eventType, count(*) as count from [xyz:xyz.abc] where productId in ("168561","175936","161684","161681","161686") and eventTime > CAST("2018-05-30 11:21:19" as DATETIME) group by eventType, productId order by productId;
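As a quick sanity check that the string concatenation in get_data_from_bq yields exactly this sample query, the query construction can be pulled out into a pure function. This is only a sketch: the name build_query is hypothetical and not part of the original code, and time_thresh is passed in as an argument rather than read from a global.

```python
def build_query(product_ids, time_thresh):
    """Build the same legacy-SQL query string as get_data_from_bq.

    build_query is a hypothetical helper introduced for illustration.
    """
    # Quote each ID and join with commas, as in the original function.
    id_list = ",".join('"{}"'.format(_id) for _id in product_ids)
    return (
        "select productId, eventType, count(*) as count "
        "from [xyz:xyz.abc] "
        "where productId in (" + id_list + ") "
        'and eventTime > CAST("' + time_thresh + '" as DATETIME) '
        "group by eventType, productId order by productId;"
    )


sample = build_query(
    ["168561", "175936", "161684", "161681", "161686"],
    "2018-05-30 11:21:19",
)
print(sample)
```

The printed string is identical to the sample query, which is consistent with the query text itself not being the cause of the error.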
小智 8
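I had exactly the same problem. The issue is not the query itself; you are most likely reusing the same QueryJobConfig. When you run a query, unless you set a destination, BigQuery stores the results in an anonymous table, and that table is recorded in the QueryJobConfig object. If you reuse the config, BigQuery tries to store the new results in the same anonymous table, which causes the error. To be honest, I don't particularly like this behavior.
You should rewrite your code like this: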
from google.cloud.bigquery import QueryJobConfig

def get_data_from_bq(product_ids):
    format_strings = ','.join([("\"" + str(_id) + "\"") for _id in product_ids])
    query = "select productId, eventType, count(*) as count from [xyz:xyz.abc] where productId in (" + format_strings + ") and eventTime > CAST(\"" + time_thresh +"\" as DATETIME) group by eventType, productId order by productId;"
    # Create a new QueryJobConfig on every call instead of reusing a shared one.
    query_job = bigquery_client.query(query, job_config=QueryJobConfig())
    return query_job.result()
Hope this helps!
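The fix described in the answer, one config per query, can be sketched as a small helper that takes a config factory instead of a config object. The names run_with_fresh_config, make_config, FakeClient and FakeJob are hypothetical and exist only so the sketch runs without BigQuery credentials. With the real library, make_config would presumably be a function that creates a QueryJobConfig and reapplies whatever settings the shared job_config carried. The question does not show those settings, but since the table is written in legacy syntax ([xyz:xyz.abc]), setting use_legacy_sql = True on the new config is a likely candidate.

```python
def run_with_fresh_config(client, make_config, query):
    """Run `query`, building a new job config for every call."""
    # make_config() is invoked on each call, so no config object is
    # shared between queries.
    job_config = make_config()
    query_job = client.query(query, job_config=job_config)
    return query_job.result()


# Stand-ins for the BigQuery client and job (assumptions for the demo only).
class FakeJob(object):
    def __init__(self, rows):
        self._rows = rows

    def result(self):
        return self._rows


class FakeClient(object):
    def __init__(self):
        self.seen_configs = []

    def query(self, query, job_config=None):
        # Record the config object passed in for each query.
        self.seen_configs.append(job_config)
        return FakeJob([query])


fake_client = FakeClient()
for q in ["select 1", "select 2"]:
    # dict stands in for a factory such as a function returning QueryJobConfig().
    run_with_fresh_config(fake_client, dict, q)

# Each call received its own config object.
print(fake_client.seen_configs[0] is not fake_client.seen_configs[1])
```

Passing a factory rather than a config keeps any required settings in one place while still guaranteeing that each query gets its own config object.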