Creating a schema for a nested table - bigquery

Mar*_*ver 3 google-bigquery

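I'm trying to upload a table of test data that has several levels of nesting, but I can't seem to get the syntax for specifying the schema right.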

Here is my current schema file:

{
  "name":"city", "type":"RECORD",
    [
      {"name":"id", "type":"INTEGER"},
      {"name":"name", "type":"STRING"},
      {"name":"country", "type":"STRING"},
      {"name":"coord", "type":"RECORD"},
        [
          {"name":"lon", "type":"FLOAT"},
          {"name":"lat", "type":"FLOAT"}
        ],
    {"name":"time", "type":"TIMESTAMP"}
  ]
}

Here is a sample of the data:

{"city":{"id":1283240,"name":"Kathmandu","country":"NP","coord":{"lon":85.316666,"lat":27.716667}},"time":1394865171,"data":[{"dt":1394852400,"main":{"temp":296.15,"temp_min":293.866,"temp_max":296.15}},{"dt":1394863200,"main":{"temp":301.51,"temp_min":299.345,"temp_max":301.51}}]}

In the full file I have multiple cities, and each city has multiple "data" points per day.

Thanks,

Mark

Jor*_*ani 7

If you have a RECORD type, the JSON array that holds its nested schema must be named "fields". For example:

{
  "name":"city", "type":"RECORD",
  "fields": [
    {"name":"id", "type":"INTEGER"},
    {"name":"name", "type":"STRING"},
    {"name":"country", "type":"STRING"},
    {"name":"coord", "type":"RECORD",
     "fields": [
       {"name":"lon", "type":"FLOAT"},
       {"name":"lat", "type":"FLOAT"}
     ]},
    {"name":"time", "type":"TIMESTAMP"}
  ]
}

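Another problem is that you were closing the inner schema with } in the wrong place: the } for "coord" has to come after its "fields" array, not before it.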

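One trick I like is to use Python's json.loads() function to check that I've actually created a valid JSON object, since it can be hard to tell whether you have all the commas you need and have closed all the quotes and brackets correctly. For example: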

$ python
>>> import json
>>> schema = """
... <paste your initial schema>
... """
>>> json.loads(schema)

ValueError: Expecting property name: line 4 column 5 (char 41)

(It's complaining that you have an array element with no property name... you need "fields".)
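
The same check can be scripted rather than typed into the interpreter. Below is a minimal sketch, not an official BigQuery tool: it parses a schema with json.loads() and then walks it to confirm that every RECORD entry carries a non-empty "fields" array. The helper names find_records_missing_fields and validate_schema are hypothetical, made up for this illustration, and the embedded schema is simply the corrected one from this answer.

```python
import json

# The corrected schema from this answer, embedded as a string for the demo.
SCHEMA = """
{
  "name": "city", "type": "RECORD",
  "fields": [
    {"name": "id", "type": "INTEGER"},
    {"name": "name", "type": "STRING"},
    {"name": "country", "type": "STRING"},
    {"name": "coord", "type": "RECORD",
     "fields": [
       {"name": "lon", "type": "FLOAT"},
       {"name": "lat", "type": "FLOAT"}
     ]},
    {"name": "time", "type": "TIMESTAMP"}
  ]
}
"""


def find_records_missing_fields(field, path=""):
    """Return dotted paths of RECORD entries lacking a non-empty "fields" list.

    Hypothetical helper for illustration only.
    """
    here = f"{path}.{field['name']}" if path else field["name"]
    problems = []
    if field.get("type", "").upper() == "RECORD":
        children = field.get("fields")
        if not isinstance(children, list) or not children:
            problems.append(here)
        else:
            # Recurse so nested RECORDs (such as "coord") are checked too.
            for child in children:
                problems.extend(find_records_missing_fields(child, here))
    return problems


def validate_schema(text):
    """Parse text as JSON, then check each RECORD; return the problem paths."""
    # json.loads raises ValueError (json.JSONDecodeError) on a syntax error.
    parsed = json.loads(text)
    # Accept either a single field object or a list of field objects.
    top_level = parsed if isinstance(parsed, list) else [parsed]
    problems = []
    for field in top_level:
        problems.extend(find_records_missing_fields(field))
    return problems


if __name__ == "__main__":
    print(validate_schema(SCHEMA))
```

For the corrected schema this prints an empty list. The schema from the question never gets as far as the RECORD check, because json.loads() rejects the unnamed array first, which is the ValueError shown above.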