Loading a JSON file into BigQuery using the Google BigQuery Client API

iBr*_*aAa 6 python google-bigquery

Is there a way to load a JSON file from the local file system into BigQuery using the Google BigQuery Client API?

All the options I have found are:

1 - Stream the records in one by one.

2 - Load the JSON data from GCS.

3 - Load the JSON with a raw POST request (i.e., not via the Google Client API).
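For reference, option 1 with the same client library goes through the `tabledata.insertAll` streaming endpoint. A minimal sketch (the request-body helper below is hypothetical; the commented call assumes an authorized `service` object built with `googleapiclient.discovery.build('bigquery', 'v2', ...)`):

```python
def build_insert_all_body(rows):
    """Shape a list of record dicts into the body tabledata.insertAll expects."""
    return {
        'kind': 'bigquery#tableDataInsertAllRequest',
        # Each row is wrapped in a {'json': ...} envelope.
        'rows': [{'json': row} for row in rows],
    }

# Usage against a built service object (requires credentials):
# service.tabledata().insertAll(
#     projectId=PROJECT_ID, datasetId=DATASET_ID, tableId=TABLE_ID,
#     body=build_insert_all_body([{'string_f': 'a', 'integer_f': 1}])
# ).execute()
```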

Jor*_*ani 3

I'm assuming from the python tag that you want to do this from Python. There is a load example here that loads data from a local file (it uses CSV, but it is easy to adapt it to JSON... there is another JSON example in the same directory).

The basic flow is:

# Requires the Google API client: pip install google-api-python-client
# `jobs` below is the jobs collection of an authorized BigQuery service,
# e.g. jobs = service.jobs() after discovery.build('bigquery', 'v2', ...).
import time

from googleapiclient.http import MediaFileUpload

# Load configuration with the destination specified.
load_config = {
  'destinationTable': {
    'projectId': PROJECT_ID,
    'datasetId': DATASET_ID,
    'tableId': TABLE_ID
  }
}

load_config['schema'] = {
  'fields': [
    {'name': 'string_f', 'type': 'STRING'},
    {'name': 'boolean_f', 'type': 'BOOLEAN'},
    {'name': 'integer_f', 'type': 'INTEGER'},
    {'name': 'float_f', 'type': 'FLOAT'},
    {'name': 'timestamp_f', 'type': 'TIMESTAMP'}
  ]
}
load_config['sourceFormat'] = 'NEWLINE_DELIMITED_JSON'

# This tells it to perform a resumable upload of a local file
# called 'foo.json'
upload = MediaFileUpload('foo.json',
                         mimetype='application/octet-stream',
                         # This enables resumable uploads.
                         resumable=True)

start = time.time()
job_id = 'job_%d' % start
# Create the job.
result = jobs.insert(
  projectId=PROJECT_ID,
  body={
    'jobReference': {
      'jobId': job_id
    },
    'configuration': {
      'load': load_config
    }
  },
  media_body=upload).execute()

# Then you'd also want to wait for the result and check the status. (Check out
# the example at the link for more info.)
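Waiting for the result can be done by polling `jobs.get()` until the job's state is `DONE` and then inspecting `status.errors`. A minimal sketch, assuming `jobs` is the same authorized jobs collection used above (the helper names are mine, not from the answer):

```python
import time


def job_is_done(job):
    """True when a jobs.get() response reports the job has finished."""
    return job.get('status', {}).get('state') == 'DONE'


def job_errors(job):
    """Any errors reported in the job's status (empty list on success)."""
    return job.get('status', {}).get('errors', [])


def wait_for_job(jobs, project_id, job_id, poll_seconds=2):
    # Poll jobs.get() until the load job reaches the DONE state,
    # then raise if the status carries errors.
    while True:
        job = jobs.get(projectId=project_id, jobId=job_id).execute()
        if job_is_done(job):
            errors = job_errors(job)
            if errors:
                raise RuntimeError('Load failed: %s' % errors)
            return job
        time.sleep(poll_seconds)
```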