br1 | 2 | python, csv, storage, google-bigquery, google-cloud-platform
I am trying to append new rows from a CSV file to an existing BigQuery table. The CSV is:
"sprotocol";"w5q53";"insertingdate";"closeddate";"sollectidate";"company";"companyid";"contact"
"20-22553";"DELETED";"2020-01-26;0000-01-01 00:00";"0000-01-01 00:00";"";"";"this is a ticket"
This is my Python function:
job_config = bigquery.LoadJobConfig()
job_config.source_format = 'text/csv'
job_config.write_disposition = bigquery.WriteDisposition.WRITE_APPEND
job_config.source_format = bigquery.SourceFormat.CSV
job_config.skip_leading_rows = 1
job_config.autodetect = False
job_config.schema = [
    bigquery.SchemaField("sprotocol", "STRING", mode="NULLABLE"),
    bigquery.SchemaField("w5q53", "STRING", mode="NULLABLE"),
    bigquery.SchemaField("insertingdate", "TIMESTAMP", mode="NULLABLE"),
    bigquery.SchemaField("closeddate", "STRING", mode="NULLABLE"),
    bigquery.SchemaField("sollectidate", "STRING", mode="NULLABLE"),
    bigquery.SchemaField("company", "STRING", mode="NULLABLE"),
    bigquery.SchemaField("companyid", "STRING", mode="NULLABLE"),
    bigquery.SchemaField("contact", "STRING", mode="NULLABLE")
]
job_config.fieldDelimiter = ';'
job_config.allow_quoted_newlines = True
with open(file_path, "rb") as file:
    load_job = _connection.load_table_from_file(
        file,
        table_ref,
        job_config=job_config
    )  # API request
    print("Starting job {}".format(load_job.job_id))
    load_job.result()  # Waits for table load to complete.
    print("Job finished.")
    file.close()
I get the following error:
[{'reason': 'invalid', 'message': 'Error while reading data, error message: CSV table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the errors[] collection for more details.'}, {'reason': 'invalid', 'message': 'Error while reading data, error message: CSV table references column position 55, but line starting at position:743 contains only 1 columns.'}]
I also tried removing the schema definition, but I get the same error. Can anyone help me?
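There are three problems with the code above:

1. Use field_delimiter instead of fieldDelimiter. fieldDelimiter is the REST API field name, not the Python client's property, so the ';' delimiter is evidently not applied and each line is read as a single comma-delimited column. That matches the "contains only 1 columns" part of the error.

    job_config.field_delimiter = ';'

2. Use DATE instead of TIMESTAMP for insertingdate, because the input contains only a date:

    bigquery.SchemaField("insertingdate", "DATE", mode="NULLABLE"),

3. The double quotes in the data row are wrong. The closing quote after 2020-01-26 and the opening quote before the next value are missing, so two fields are merged into one and the row has 7 fields instead of 8. The corrected row is:

    "20-22553";"DELETED";"2020-01-26";"0000-01-01 00:00";"0000-01-01 00:00";"";"";"this is a ticket"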
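The delimiter and quoting problems can be spotted locally before starting a load job. The following is a minimal sketch using only Python's standard csv module; the helper names field_counts and find_bad_rows are hypothetical, and Python's parser is not identical to BigQuery's, so treat it as a rough pre-flight check rather than a guarantee.

```python
import csv
import io

# Sample data copied from the question: header, original row, corrected row.
HEADER = '"sprotocol";"w5q53";"insertingdate";"closeddate";"sollectidate";"company";"companyid";"contact"'
BAD_ROW = '"20-22553";"DELETED";"2020-01-26;0000-01-01 00:00";"0000-01-01 00:00";"";"";"this is a ticket"'
FIXED_ROW = '"20-22553";"DELETED";"2020-01-26";"0000-01-01 00:00";"0000-01-01 00:00";"";"";"this is a ticket"'


def field_counts(text, delimiter=";"):
    """Return the number of fields csv.reader finds in each row of text."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar='"')
    return [len(row) for row in reader]


def find_bad_rows(text, delimiter=";"):
    """Return 1-based row numbers whose field count differs from the header's."""
    counts = field_counts(text, delimiter)
    if not counts:
        return []
    expected = counts[0]  # the header row defines the expected width
    return [i for i, n in enumerate(counts, start=1) if n != expected]


if __name__ == "__main__":
    sample = HEADER + "\n" + BAD_ROW
    # With the wrong (default comma) delimiter every row collapses to one field.
    print(field_counts(sample, delimiter=","))
    # With ';' the header has 8 fields but the mis-quoted row has fewer.
    print(field_counts(sample))
    print(find_bad_rows(sample))
    print(find_bad_rows(HEADER + "\n" + FIXED_ROW))
```

Parsing with a comma delimiter reproduces the single-column symptom from the error message, while parsing with ';' flags the row whose quotes merge two fields.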
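Putting the first two fixes together, a corrected load might look like the sketch below. It assumes the google-cloud-bigquery package is installed and that a bigquery.Client is passed in as client (the question calls it _connection); the names SCHEMA, build_load_job_config and append_csv are hypothetical. The third fix, the quoting, has to be made in the CSV data itself.

```python
# Column names and types from the question, with insertingdate changed to DATE.
SCHEMA = [
    ("sprotocol", "STRING"),
    ("w5q53", "STRING"),
    ("insertingdate", "DATE"),
    ("closeddate", "STRING"),
    ("sollectidate", "STRING"),
    ("company", "STRING"),
    ("companyid", "STRING"),
    ("contact", "STRING"),
]


def build_load_job_config():
    """Build a CSV append config using field_delimiter and the DATE column."""
    # Imported lazily so SCHEMA can be inspected without the library installed.
    from google.cloud import bigquery

    return bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
        skip_leading_rows=1,          # skip the header row
        autodetect=False,             # use the explicit schema below
        field_delimiter=";",          # snake_case property, not fieldDelimiter
        allow_quoted_newlines=True,
        schema=[
            bigquery.SchemaField(name, field_type, mode="NULLABLE")
            for name, field_type in SCHEMA
        ],
    )


def append_csv(client, file_path, table_ref):
    """Append the CSV at file_path to table_ref and wait for the job to finish."""
    job_config = build_load_job_config()
    with open(file_path, "rb") as source_file:
        load_job = client.load_table_from_file(
            source_file, table_ref, job_config=job_config
        )
        print("Starting job {}".format(load_job.job_id))
        load_job.result()  # waits for the load to complete; raises on failure
    print("Job finished.")
    return load_job
```

The with block closes the file on exit, so the explicit file.close() from the question is not needed here.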