Is there a way to load a JSON file from the local file system into BigQuery using the Google BigQuery Client API?
All the options I have found are:
1. Stream the records one at a time (a minimal sketch of this follows the list).
2. Load the JSON data from GCS.
3. Load the JSON with a raw POST request (i.e., not through the Google Client API).
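For context on option 1, streaming inserts go through the tabledata.insertAll method of the BigQuery v2 API. A minimal sketch using google-api-python-client, assuming an authorized bigquery_service handle and placeholder PROJECT_ID, DATASET_ID, and TABLE_ID values:

# Stream a single record into an existing table via tabledata.insertAll.
# Assumes bigquery_service = googleapiclient.discovery.build('bigquery', 'v2', ...)
response = bigquery_service.tabledata().insertAll(
    projectId=PROJECT_ID,
    datasetId=DATASET_ID,
    tableId=TABLE_ID,
    body={
        'rows': [
            {'json': {'string_f': 'hello', 'integer_f': 1}}
        ]
    }).execute()
# Per-row failures are reported in the response rather than raised as exceptions.
if 'insertErrors' in response:
    print(response['insertErrors'])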
I'm assuming from the python tag that you want to do this from Python. There is a load example here that loads data from a local file (it uses CSV, but it is easy to adapt it to JSON... there is another JSON example in the same directory).
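For reference, the NEWLINE_DELIMITED_JSON source format expects one JSON object per line. A hypothetical foo.json matching the schema used below might look like this (sample values invented for illustration):

{"string_f": "hello", "boolean_f": true, "integer_f": 42, "float_f": 3.14, "timestamp_f": "2014-08-01 12:00:00"}
{"string_f": "world", "boolean_f": false, "integer_f": 7, "float_f": 2.72, "timestamp_f": "2014-08-02 08:30:00"}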
The basic flow is:
import time

from googleapiclient.http import MediaFileUpload

# Assumes `bigquery_service` is an authorized BigQuery client created with
# googleapiclient.discovery.build('bigquery', 'v2', ...), and that
# PROJECT_ID, DATASET_ID, and TABLE_ID are defined.
jobs = bigquery_service.jobs()

# Load configuration with the destination specified.
load_config = {
    'destinationTable': {
        'projectId': PROJECT_ID,
        'datasetId': DATASET_ID,
        'tableId': TABLE_ID
    }
}
load_config['schema'] = {
    'fields': [
        {'name': 'string_f', 'type': 'STRING'},
        {'name': 'boolean_f', 'type': 'BOOLEAN'},
        {'name': 'integer_f', 'type': 'INTEGER'},
        {'name': 'float_f', 'type': 'FLOAT'},
        {'name': 'timestamp_f', 'type': 'TIMESTAMP'}
    ]
}
load_config['sourceFormat'] = 'NEWLINE_DELIMITED_JSON'

# This tells the client to perform a resumable upload of a local file
# called 'foo.json'.
upload = MediaFileUpload('foo.json',
                         mimetype='application/octet-stream',
                         # This enables resumable uploads.
                         resumable=True)

start = time.time()
job_id = 'job_%d' % start

# Create the job.
result = jobs.insert(
    projectId=PROJECT_ID,
    body={
        'jobReference': {
            'jobId': job_id
        },
        'configuration': {
            'load': load_config
        }
    },
    media_body=upload).execute()

# Then you'd also want to wait for the result and check the status. (Check out
# the example at the link for more info.)
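One way to wait for the result is to poll jobs.get until the job state is DONE and then inspect errorResult. Below is a minimal sketch reusing the jobs collection and job_id from above (a production version would add a timeout and backoff):

# Poll the job until it completes.
while True:
    job = jobs.get(projectId=PROJECT_ID, jobId=job_id).execute()
    if job['status']['state'] == 'DONE':
        break
    time.sleep(2)

# A DONE job can still have failed; errorResult holds the fatal error, if any.
error_result = job['status'].get('errorResult')
if error_result:
    raise RuntimeError('Load failed: %s' % error_result['message'])
print('Loaded %s rows' % job['statistics']['load']['outputRows'])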