How to skip CSV file rows in the BigQuery load API

Shi*_*kha 3 google-cloud-storage google-bigquery

I am trying to load CSV data from a Cloud Storage bucket into a BigQuery table using the BigQuery API. My code is:

def load_data_from_gcs(dataset_name, table_name, source):
    bigquery_client = bigquery.Client()
    dataset = bigquery_client.dataset(dataset_name)
    table = dataset.table(table_name)
    job_name = str(uuid.uuid4())

    job = bigquery_client.load_table_from_storage(
        job_name, table, source)
    job.sourceFormat = 'CSV'
    job.fieldDelimiter = ','
    job.skipLeadingRows = 2

    job.begin()
    job.result()  # Wait for job to complete

    print('Loaded {} rows into {}:{}.'.format(
        job.output_rows, dataset_name, table_name))

    wait_for_job(job)

It gives me the error:

400 CSV table encountered too many errors, giving up. Rows: 1; errors: 1.

This error occurs because the first two rows of my CSV file contain header information and should not be loaded. I have set job.skipLeadingRows = 2, but it does not skip the first two rows. Is there some other syntax for setting the number of rows to skip?

Please help.

Gra*_*ley 5

You misspelled the property names (camelCase instead of underscores). It's skip_leading_rows, not skipLeadingRows. Likewise field_delimiter and source_format.

See the Python source code here.
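Putting that together, here is the question's function with the property names corrected. This is a sketch assuming the same pre-1.0 google-cloud-bigquery client the question uses (where Client.load_table_from_storage exists and load jobs are configured by setting attributes on the job object); running it requires valid credentials and an existing dataset and table.

```python
import uuid

from google.cloud import bigquery


def load_data_from_gcs(dataset_name, table_name, source):
    bigquery_client = bigquery.Client()
    dataset = bigquery_client.dataset(dataset_name)
    table = dataset.table(table_name)
    job_name = str(uuid.uuid4())

    job = bigquery_client.load_table_from_storage(
        job_name, table, source)
    # snake_case, not camelCase: these are the property names
    # the Python client actually exposes.
    job.source_format = 'CSV'
    job.field_delimiter = ','
    job.skip_leading_rows = 2

    job.begin()
    job.result()  # Wait for job to complete

    print('Loaded {} rows into {}:{}.'.format(
        job.output_rows, dataset_name, table_name))
```

With skip_leading_rows set through the correct property, the two header rows are no longer parsed as data, which resolves the "too many errors" failure.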