Joo*_*mel 10 pandas google-bigquery
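I am trying to append one table to another through pandas, pulling the data from BigQuery and sending it to a different BigQuery dataset. Although the table schemas are exactly the same, I get the error "pandas_gbq.gbq.InvalidSchema: Please verify that the structure and data types in the DataFrame match the schema of the destination table."
I hit this error before and worked around it by overwriting the table, but in this case the datasets are too large to do that (and it is not a sustainable solution).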
import pandas as pd

# query, bigquery_key, dataset and projectid are defined elsewhere in the script.
df = pd.read_gbq(query, project_id="my-project", credentials=bigquery_key,
                 dialect='standard')

# Append the query result to the destination table, passing the schema explicitly.
pd.io.gbq.to_gbq(df, dataset, projectid,
                 if_exists='append',
                 table_schema=[{'name': 'Date', 'type': 'STRING'},
                               {'name': 'profileId', 'type': 'STRING'},
                               {'name': 'Opco', 'type': 'STRING'},
                               {'name': 'country', 'type': 'STRING'},
                               {'name': 'deviceType', 'type': 'STRING'},
                               {'name': 'userType', 'type': 'STRING'},
                               {'name': 'users', 'type': 'INTEGER'},
                               {'name': 'sessions', 'type': 'INTEGER'},
                               {'name': 'bounceRate', 'type': 'FLOAT'},
                               {'name': 'sessionsPerUser', 'type': 'FLOAT'},
                               {'name': 'avgSessionDuration', 'type': 'FLOAT'},
                               {'name': 'pageviewsPerSession', 'type': 'FLOAT'}],
                 credentials=bigquery_key)
The schema in BigQuery is as follows:
Date STRING
profileId STRING
Opco STRING
country STRING
deviceType STRING
userType STRING
users INTEGER
sessions INTEGER
bounceRate FLOAT
sessionsPerUser FLOAT
avgSessionDuration FLOAT
pageviewsPerSession FLOAT
I then get the following error:
Traceback (most recent call last):
  File "..file.py", line 63, in <module>
    main()
  File "..file.py", line 57, in main
    updating_general_data(bigquery_key)
  File "..file.py", line 46, in updating_general_data
    credentials=bigquery_key)
  File "..\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\io\gbq.py", line 162, in to_gbq
    credentials=credentials, verbose=verbose, private_key=private_key)
  File "..\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas_gbq\gbq.py", line 1141, in to_gbq
    "Please verify that the structure and "
pandas_gbq.gbq.InvalidSchema: Please verify that the structure and data types in the DataFrame match the schema of the destination table.
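To me, this looks like a one-to-one match. I have seen other threads about this error, but they mostly discuss date formats. In this case the date column is already a string, and table_schema still declares it as STRING.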
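One way to narrow this down is to compare what the DataFrame actually contains against the schema before calling to_gbq. The following is only a rough sketch: the helper `schema_mismatches` and the `KIND_TO_BQ` mapping are hypothetical names made up for this check, and the mapping is a simplified assumption, not the logic pandas-gbq uses internally. It may still point at a column whose name differs in case, or an INTEGER column that pandas has turned into float64 because it contains nulls.

```python
import pandas as pd

# Assumed, simplified mapping from NumPy/pandas dtype kind to a BigQuery type.
# This is for a quick sanity check only; it is not pandas-gbq's own mapping.
KIND_TO_BQ = {
    "O": "STRING",     # object / string columns
    "i": "INTEGER",    # signed integers
    "u": "INTEGER",    # unsigned integers
    "f": "FLOAT",      # floats (including ints that contain NaN)
    "b": "BOOLEAN",
    "M": "TIMESTAMP",  # datetime64 columns
}


def schema_mismatches(df, expected_schema):
    """Return human-readable differences between df and expected_schema.

    expected_schema is a list of {'name': ..., 'type': ...} dicts, in the
    same shape as the table_schema argument used in the question.
    """
    problems = []
    expected_names = [field["name"] for field in expected_schema]

    # Columns present in the DataFrame but not in the schema (names are
    # compared case-sensitively).
    for name in df.columns:
        if name not in expected_names:
            problems.append(f"unexpected column: {name}")

    # Columns missing from the DataFrame, or with a different apparent type.
    for field in expected_schema:
        name = field["name"]
        if name not in df.columns:
            problems.append(f"missing column: {name}")
            continue
        inferred = KIND_TO_BQ.get(df[name].dtype.kind, "UNKNOWN")
        if inferred != field["type"]:
            problems.append(
                f"{name}: DataFrame looks like {inferred}, "
                f"schema says {field['type']}"
            )
    return problems
```

Calling `schema_mismatches(df, table_schema)` on the DataFrame returned by read_gbq, with the same list passed as table_schema, returns an empty list if the column names and apparent types line up under these assumptions. Printing `df.dtypes` alongside it shows the raw pandas types.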