AME*_*AME 6 google-sheets google-bigquery
import pandas as pd
from google.cloud import bigquery
import google.auth
# from google.cloud import bigquery
# Create credentials with Drive & BigQuery API scopes
# Both APIs must be enabled for your project before running this code
credentials, project = google.auth.default(scopes=[
    'https://www.googleapis.com/auth/drive',
    'https://www.googleapis.com/auth/spreadsheets',
    'https://www.googleapis.com/auth/bigquery',
])
client = bigquery.Client(credentials=credentials, project=project)
# Configure the external data source and query job
external_config = bigquery.ExternalConfig('GOOGLE_SHEETS')
# Use a shareable link or grant viewing access to the email address you
# used to authenticate with BigQuery (this example Sheet is public)
sheet_url = (
    'https://docs.google.com/spreadsheets'
    '/d/1uknEkew2C3nh1JQgrNKjj3Lc45hvYI2EjVCcFRligl4/edit?usp=sharing')
external_config.source_uris = [sheet_url]
external_config.schema = [
    bigquery.SchemaField('name', 'STRING'),
    bigquery.SchemaField('post_abbr', 'STRING'),
]
external_config.options.skip_leading_rows = 1 # optionally skip header row
table_id = 'BambooHRActiveRoster'
job_config = bigquery.QueryJobConfig()
job_config.table_definitions = {table_id: external_config}
# Get Top 10
sql = 'SELECT * FROM workforce.BambooHRActiveRoster LIMIT 10'
query_job = client.query(sql, job_config=job_config) # API request
top10 = list(query_job) # Waits for query to finish
print('Query returned {} rows.'.format(len(top10)))
The error I get is:
BadRequest: 400 Error while reading table: workforce.BambooHRActiveRoster, error message: Failed to read the spreadsheet. Errors: No OAuth token with Google Drive scope was found.
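I can pull data from a BigQuery table that was created from a CSV upload, but when the BigQuery table is created from a linked Google Sheet, I keep getting this error.
The code above is my attempt to replicate the example from the Google documentation (creating and querying a temporary table).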
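You are authenticating as yourself, which is usually fine for BQ as long as you have the right permissions. Querying a table that is linked to a Google Sheet, however, generally requires a service account. Create one (or have your BI/IT team create one), then share the underlying Google Sheet with that service account. Finally, modify your Python script to use the service account's credentials instead of your own.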
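The last step might look roughly like the sketch below. It assumes a service-account JSON key has been downloaded and that the Sheet has already been shared with the service account's email address. The key path `service-account.json` and the helper name `make_sheets_client` are hypothetical, not something from the question. The Drive scope is requested because the error message reports that no token with that scope was found.

```python
# Scopes requested for the service-account credentials. The Drive scope lets
# BigQuery read the Google Sheet that backs the linked table.
SCOPES = [
    "https://www.googleapis.com/auth/drive",
    "https://www.googleapis.com/auth/bigquery",
]


def make_sheets_client(key_path="service-account.json"):
    """Build a BigQuery client from a service-account key file.

    key_path is a hypothetical location; point it at your own key file.
    """
    # Imported here so the Google client libraries are only needed when a
    # client is actually created.
    from google.cloud import bigquery
    from google.oauth2 import service_account

    credentials = service_account.Credentials.from_service_account_file(
        key_path, scopes=SCOPES
    )
    return bigquery.Client(
        credentials=credentials, project=credentials.project_id
    )
```

With a client built this way, `client.query('SELECT * FROM workforce.BambooHRActiveRoster LIMIT 10')` runs as the service account, so it should succeed only if the Sheet is shared with that account.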
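A quick workaround is to run select * against the Sheets-linked table in the BQ UI, save the results to a new table, and then query that new table directly from your Python script. This works well for a one-off upload or analysis. If the data in the sheet keeps changing and you need to query it regularly, it is not a long-term solution.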
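Once the results are saved to a native table, the Python side should no longer need the Drive scope. A minimal sketch, assuming the snapshot was saved as `workforce.roster_snapshot` (a hypothetical name) and that your default credentials can read it:

```python
def snapshot_query(dataset, table, limit=10):
    """Return SQL that reads the first `limit` rows of a native table."""
    return "SELECT * FROM `{}.{}` LIMIT {}".format(dataset, table, int(limit))


def fetch_snapshot_rows(dataset="workforce", table="roster_snapshot", limit=10):
    """Query the saved snapshot table using default credentials.

    The dataset and table defaults are hypothetical; use the names you chose
    when saving the query results in the BQ UI.
    """
    from google.cloud import bigquery  # needed only when a query is run

    client = bigquery.Client()
    query_job = client.query(snapshot_query(dataset, table, limit))
    # result() waits for the query to finish before rows are read.
    return [dict(row.items()) for row in query_job.result()]
```

The snapshot is a copy taken at the moment it was saved, so later edits to the Sheet will not show up until the table is saved again.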