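The offending record is shown below. It should consist of 14 columns, one of which starts with "Hi I'm Nigerian..." and spans multiple lines because it contains embedded newlines.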
17935,9a7105ee-30c8-4a6d-9374-10875b7d6288.jpg,"""top""=>""0"", ""left""=>""0"", ""width""=>""180"", ""height""=>""180""",,"",2015-07-26 19:33:57.292058,2015-07-26 20:25:30.068887,fe43876f-1b2c-464a-aa20-bf335ed3ff62,c68c8c70-bc2b-11e4-90a1-22000b21105f,{},2e790350-15fb-0133-2cb8-22000ba51078,"Hi I'm Nigerian so wish to study in sweden.
so I'm Undergraduate student I want study Engineering.
Thanks.","",{}
Loading this CSV data into BigQuery with the command bq load --replace --source_format=CSV -F"," ... fails with the errors below. Can anyone suggest a fix for this BigQuery load command?
- File: 0 / Line:17192 / Field:12: Missing close double quote (") character: field starts with: <Hi I'm N>
- File: 0 / Line:17193: Too few columns: expected 14 column(s) but got 1 column(s). For additional help: http://goo.gl/RWuPQ
- File: 0 / Line:17194: Too few columns: expected 14 column(s) but got 3 column(s). For additional help: http://goo.gl/RWuPQ
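If you are loading a CSV file with embedded newlines, you need to specify allowQuotedNewlines. With the bq command-line tool, this is the --allow_quoted_newlines flag on bq load.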
https://cloud.google.com/bigquery/docs/reference/v2/jobs#configuration.load.allowQuotedNewlines
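For anyone calling the REST API directly rather than bq, the option sits under configuration.load in the job resource. Below is a minimal sketch of such a request body as a Python dict. The bucket, project, dataset and table names are hypothetical placeholders, and WRITE_TRUNCATE is assumed here as the equivalent of --replace.

```python
import json

# Hypothetical placeholders: replace the URI, project, dataset and table
# with your own values. None of these names come from the question.
load_job_body = {
    "configuration": {
        "load": {
            "sourceUris": ["gs://example-bucket/profiles.csv"],
            "destinationTable": {
                "projectId": "example-project",
                "datasetId": "example_dataset",
                "tableId": "profiles",
            },
            "sourceFormat": "CSV",
            "fieldDelimiter": ",",
            # Assumed equivalent of `bq load --replace`.
            "writeDisposition": "WRITE_TRUNCATE",
            # The setting this answer is about: permit newlines inside quoted fields.
            "allowQuotedNewlines": True,
        }
    }
}

# Print the body as JSON, ready to be sent with a jobs.insert request.
print(json.dumps(load_job_body, indent=2))
```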
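By default, BigQuery assumes that CSV data does not contain newlines inside fields. This gives higher parsing throughput on large data files, because the input file can be split at any newline and the pieces parsed in parallel. If the data contains newlines inside quoted strings, each file has to be parsed linearly by a single machine.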
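The effect of splitting at every newline can be reproduced locally. The sketch below uses a shortened, made-up 5-column version of the record from the question (not the real 14-column row) and compares naive line splitting with Python's quote-aware csv module. It only illustrates the idea and is not how BigQuery is implemented.

```python
import csv
import io

# Shortened, made-up 5-column record; the third field contains two newlines.
data = (
    '17935,photo.jpg,"Hi I\'m Nigerian so wish to study in sweden.\n'
    "so I'm Undergraduate student I want study Engineering.\n"
    'Thanks.","",{}\n'
)

# Naive approach: treat every newline as a record boundary, then split on commas.
naive_rows = [line.split(",") for line in data.splitlines()]

# Quote-aware approach: csv.reader keeps newlines that occur inside quoted fields.
# newline="" stops StringIO from translating the embedded line breaks.
parsed_rows = list(csv.reader(io.StringIO(data, newline="")))

print("naive:", len(naive_rows), "rows with column counts", [len(r) for r in naive_rows])
print("quote-aware:", len(parsed_rows), "row with", len(parsed_rows[0]), "columns")
```

The naive split turns one record into three short "rows", which is the same pattern as the "Too few columns" errors in the question.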
With the Python client library, make sure to set job_config.allow_quoted_newlines = True before loading the data into BigQuery:
from google.cloud import bigquery

job_config = bigquery.LoadJobConfig()
job_config.allow_quoted_newlines = True  # accept newlines inside quoted fields
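For context, here is a rough sketch of how that setting might fit into a complete load from a local file. It assumes the google-cloud-bigquery package is installed, that default credentials are configured, and that the destination table already exists with the 14-column schema. The function name and its arguments are hypothetical, and WRITE_TRUNCATE is assumed to stand in for --replace.

```python
def load_csv_with_quoted_newlines(csv_path, table_id):
    """Load a local CSV file into BigQuery, allowing newlines inside quoted fields.

    Hypothetical helper: csv_path is a local file path and table_id has the
    form "project.dataset.table".
    """
    # Imported inside the function so this sketch can be imported even where
    # the client library is not installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        allow_quoted_newlines=True,
        # Assumed equivalent of `bq load --replace`.
        write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
    )
    with open(csv_path, "rb") as source_file:
        job = client.load_table_from_file(source_file, table_id, job_config=job_config)
    # Block until the load job finishes; raises if the job failed.
    return job.result()
```

It would be called as load_csv_with_quoted_newlines("profiles.csv", "example-project.example_dataset.profiles"), where both arguments are placeholders.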