What does a REPEATED field mean in Google BigQuery?

han*_*s-t 17 google-bigquery

Please check my understanding of REPEATED fields against the following examples:

{
    "title": "History of Alphabet",
    "author": [
        {
            "name": "Larry"
        }
    ]
}

This JSON has the schema:

[
    {
        "name": "title",
        "type": "STRING"
    },
    {
        "name": "author",
        "type": "RECORD",
        "fields": [
            {
                "name": "name",
                "type": "STRING"
            }
        ]
    }
]

But the following JSON

{
    "title": "History of Alphabet",
    "author": ["Larry", "Steve", "Eric"]
}

has the schema:

[
    {
        "name": "title",
        "type": "STRING"
    },
    {
        "name": "author",
        "type": "STRING",
        "mode": "REPEATED"
    }
]

Is this correct?

NB: I tried looking through the documentation but could not find any explanation.

Jer*_*dit 19

Close. In the first example, author is an array of objects, which corresponds to a repeated record in BQ. The schema would be:

[
    {
        "name": "title",
        "type": "STRING"
    },
    {
        "name": "author",
        "type": "RECORD",
        "mode": "REPEATED",   <--- NOTE!
        "fields": [
            {
                "name": "name",
                "type": "STRING"
            }
        ]
    }
]

Your second data/schema pair looks fine (but note that the whole schema is an array, not an object, and it needs commas between elements).
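To make the mapping concrete, here is a minimal Python sketch that derives a BigQuery-style schema from one JSON row. The helpers infer_field and infer_schema are hypothetical names made up for illustration, not part of any BigQuery library, and the sketch assumes that only strings, objects and non-empty homogeneous arrays occur. The point it shows is that a JSON array always becomes "mode": "REPEATED", while the element type decides between STRING and RECORD.

```python
import json


def infer_field(name, value):
    """Build a BigQuery-style schema entry for one JSON key (illustrative only)."""
    repeated = isinstance(value, list)
    if repeated:
        # A JSON array maps to mode REPEATED; its elements decide the type.
        # Assumption: the array is non-empty and all elements share one shape.
        value = value[0]

    if isinstance(value, dict):
        # A JSON object maps to a RECORD with nested fields.
        field = {
            "name": name,
            "type": "RECORD",
            "fields": [infer_field(k, v) for k, v in value.items()],
        }
    elif isinstance(value, str):
        field = {"name": name, "type": "STRING"}
    else:
        raise TypeError(f"unsupported value for {name!r} in this sketch")

    if repeated:
        field["mode"] = "REPEATED"
    return field


def infer_schema(row):
    """The schema is a list of field entries, one per top-level key."""
    return [infer_field(k, v) for k, v in row.items()]


first = json.loads('{"title": "History of Alphabet", "author": [{"name": "Larry"}]}')
second = json.loads('{"title": "History of Alphabet", "author": ["Larry", "Steve", "Eric"]}')

print(json.dumps(infer_schema(first), indent=4))
print(json.dumps(infer_schema(second), indent=4))
```

For the first row, author comes out as a RECORD with mode REPEATED; for the second, it comes out as a STRING with mode REPEATED. Fields that are not arrays carry no mode entry here, matching the schemas in the question.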

There is some discussion of nested and repeated fields here: https://cloud.google.com/bigquery/docs/data?hl=en#nested

There are also some sample JSON data objects here: https://cloud.google.com/bigquery/preparing-data-for-bigquery#dataformats

But I agree that we do not do a good job of explaining how these objects map to BQ schemas. Sorry about that!