
Spark BigQuery Connector: writing an ARRAY type throws an exception: "Invalid value: ARRAY is not a valid value"

I am running a Spark job in Google Cloud Dataproc and using the BigQuery Connector to load the job's JSON output into a BigQuery table.

The BigQuery Standard SQL data types documentation states that the ARRAY type is supported.

My Scala code is:

val outputDatasetId = "mydataset"
val tableSchema = "["+
    "{'name': '_id', 'type': 'STRING'},"+
    "{'name': 'array1', 'type': 'ARRAY'},"+
    "{'name': 'array2', 'type': 'ARRAY'},"+
    "{'name': 'number1', 'type': 'FLOAT'}"+
    "]"

// Output configuration
BigQueryConfiguration.configureBigQueryOutput(
    conf, projectId, outputDatasetId, "outputTable", 
    tableSchema)

//Write visits to BigQuery
jsonData.saveAsNewAPIHadoopDataset(conf)

But the job throws this exception:

{
  "code" : 400,
  "errors" : [ {
    "domain" : "global",
    "message" : "Invalid value for: ARRAY is not a valid value",
    "reason" : "invalid"
  } ],
  "message" : "Invalid …
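For reference, the table-schema JSON accepted by the BigQuery load API has no "ARRAY" type name; a field is normally made repeated by keeping its element type and setting its mode to REPEATED. Below is a minimal sketch of that convention, assuming array1 and array2 hold string values (the element type is not stated in the post):

// Minimal sketch (not from the original post): the table-schema JSON has no
// ARRAY type; a field becomes an array by keeping its element type
// (STRING assumed here) and adding 'mode': 'REPEATED'.
val repeatedTableSchema = "["+
    "{'name': '_id', 'type': 'STRING'},"+
    "{'name': 'array1', 'type': 'STRING', 'mode': 'REPEATED'},"+
    "{'name': 'array2', 'type': 'STRING', 'mode': 'REPEATED'},"+
    "{'name': 'number1', 'type': 'FLOAT'}"+
    "]"

The field names mirror the question; whether REPEATED STRING matches the actual JSON output is an assumption.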

hadoop scala google-bigquery apache-spark google-cloud-dataproc
