Tee*_*jay 4 python csv google-bigquery
I am trying to upload a local CSV file to Google BigQuery using Python:
def uploadCsvToGbq(self, table_name):
    load_config = {
        'destinationTable': {
            'projectId': self.project_id,
            'datasetId': self.dataset_id,
            'tableId': table_name
        }
    }
    load_config['schema'] = {
        'fields': [
            {'name': 'full_name', 'type': 'STRING'},
            {'name': 'age', 'type': 'INTEGER'},
        ]
    }
    load_config['sourceFormat'] = 'CSV'
    upload = MediaFileUpload('sample.csv',
                             mimetype='application/octet-stream',
                             # This enables resumable uploads.
                             resumable=True)
    start = time.time()
    job_id = 'job_%d' % start
    # Create the job.
    result = bigquery.jobs.insert(
        projectId=self.project_id,
        body={
            'jobReference': {
                'jobId': job_id
            },
            'configuration': {
                'load': load_config
            }
        },
        media_body=upload).execute()
    return result
When I run it, it throws an error like:
"NameError: global name 'MediaFileUpload' is not defined"
Is some module required? Please help.
One of the simplest ways to upload a CSV file to GBQ is through pandas. Just read the CSV file into a DataFrame (pd.read_csv()), then write the DataFrame to GBQ (df.to_gbq(full_table_id, project_id=project_id)).
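import pandas as pd

# df.to_gbq() needs the pandas-gbq package installed.
df = pd.read_csv('/..localpath/filename.csv')
# full_table_id has the form 'dataset.table'; project_id is your GCP project ID.
df.to_gbq(full_table_id, project_id=project_id)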
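If you want the table to get the same column types as the schema in the question, something along these lines might work. This is only a sketch: it calls pandas_gbq.to_gbq directly (the function that df.to_gbq delegates to), and SCHEMA, make_table_id and upload_csv_with_pandas are made-up names for illustration, not part of any library. It assumes pandas and pandas-gbq are installed and that credentials are already set up.

```python
# Schema copied from the question; adjust it to match your CSV columns.
SCHEMA = [
    {'name': 'full_name', 'type': 'STRING'},
    {'name': 'age', 'type': 'INTEGER'},
]


def make_table_id(dataset_id, table_name):
    """Build the 'dataset.table' string used as the destination table."""
    return '{}.{}'.format(dataset_id, table_name)


def upload_csv_with_pandas(csv_path, dataset_id, table_name, project_id,
                           if_exists='fail'):
    """Hypothetical helper: read a local CSV and load it via pandas-gbq.

    Returns the number of rows read from the CSV.
    """
    # Imported here so the helpers above work without these packages;
    # in a real script these would normally sit at module level.
    import pandas as pd
    import pandas_gbq

    df = pd.read_csv(csv_path)
    pandas_gbq.to_gbq(
        df,
        make_table_id(dataset_id, table_name),
        project_id=project_id,
        if_exists=if_exists,  # 'fail', 'replace' or 'append'
        table_schema=SCHEMA,
    )
    return len(df)
```

Passing if_exists='append' adds rows to an existing table instead of failing when the table is already there.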
Or you can use the client API:
from google.cloud import bigquery
import pandas as pd

df = pd.read_csv('/..localpath/filename.csv')

# Picks up the default credentials and project from the environment.
client = bigquery.Client()
dataset_ref = client.dataset('my_dataset')
table_ref = dataset_ref.table('new_table')

# Start the load job and wait for it to finish.
client.load_table_from_dataframe(df, table_ref).result()
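As for the NameError in the question itself: MediaFileUpload lives in the googleapiclient.http module of the google-api-python-client package, so it has to be imported before it is used (and so does time). Below is a rough sketch of the question's approach with those imports in place. build_load_body and upload_csv are hypothetical names, and the sketch assumes application default credentials are configured so that build('bigquery', 'v2') can authenticate.

```python
import time


def build_load_body(project_id, dataset_id, table_name, job_id):
    """Return the jobs.insert request body, mirroring the question's config."""
    return {
        # projectId is added next to jobId here; the question's body had only jobId.
        'jobReference': {'projectId': project_id, 'jobId': job_id},
        'configuration': {
            'load': {
                'destinationTable': {
                    'projectId': project_id,
                    'datasetId': dataset_id,
                    'tableId': table_name,
                },
                'schema': {
                    'fields': [
                        {'name': 'full_name', 'type': 'STRING'},
                        {'name': 'age', 'type': 'INTEGER'},
                    ]
                },
                'sourceFormat': 'CSV',
            }
        },
    }


def upload_csv(project_id, dataset_id, table_name, csv_path='sample.csv'):
    """Hypothetical helper: upload a local CSV through a BigQuery load job."""
    # These are the imports the question's code was missing. They are placed
    # inside the function only so build_load_body can be used without the
    # package installed; module level is the usual place for them.
    from googleapiclient.discovery import build
    from googleapiclient.http import MediaFileUpload

    service = build('bigquery', 'v2')
    upload = MediaFileUpload(csv_path,
                             mimetype='application/octet-stream',
                             resumable=True)  # enables resumable uploads
    job_id = 'job_%d' % time.time()
    return service.jobs().insert(
        projectId=project_id,
        body=build_load_body(project_id, dataset_id, table_name, job_id),
        media_body=upload).execute()
```

Note that jobs is called as a method (service.jobs().insert(...)) on the service object returned by build.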