Avro schema evolution: default value for a record type

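I am trying to update our Avro schema in order to add data.

I run into a problem when I try to read old data using the new schema.

Data written with the new schema reads without any problem.

Current schema: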

{
    "type": "record",
    "name": "topLevelRecord",
    "fields": [
        {
            "name": "sub_record1",
            "type": [
                {
                    "type": "record",
                    "name": "sub_record1",
                    "fields": [...]
                }
            ]
        },
        {
            "name": "sub_record2",
            "type": [
                {
                    "type": "record",
                    "name": "sub_record2",
                    "fields": [...]
                }
            ]
        },
        {...}
    ]
}


My goal is to add a new sub-record, sub_record3, with the following schema:

{
      "name": "sub_record3",
      "type": [
        {
          "type": "record",
          "name": "sub_record3",
          "fields": [
            {
              "name": "field1",
              "default": null,
              "type": [
                "null",
                "string"
              ]
            },
            {
              "name": "field2",
              "default": null,
              "type": [
                "null",
                "string"
              ]
            }]
        }]
    }


My problem comes when I try to add a default value for sub_record3.

I tried the following:

default = {}
default = {"sub_record3":{}}
default = {"sub_record3":{"field1":null, "field2":null}}
default = {"sub_record3":{"field1":"", "field2":""}}
default = {"field1":null, "field2":null}

None of these worked.

For now we use a workaround: we add null to the sub_record3 type and use it as the default. However, when the data is read through Hive, sub_record3 shows up as NULL.
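To make the workaround concrete, here is a rough sketch of the field definition, written as a Python dict and dumped to JSON. Putting "null" first in the union is an assumption on my part (it follows the Avro specification's rule that a union default corresponds to the first branch); the exact layout of our real schema may differ.

```python
import json

# The record branch, with the two fields from the sub_record3 schema above.
sub_record3_type = {
    "type": "record",
    "name": "sub_record3",
    "fields": [
        {"name": "field1", "type": ["null", "string"], "default": None},
        {"name": "field2", "type": ["null", "string"], "default": None},
    ],
}

# Workaround: make the field nullable and default it to null.
# Assumption: "null" is listed first so that a null default is accepted.
workaround_field = {
    "name": "sub_record3",
    "type": ["null", sub_record3_type],
    "default": None,
}

print(json.dumps(workaround_field, indent=2))
```

With this definition, old data resolves sub_record3 to null as a whole, which is consistent with Hive showing NULL rather than a record with two null fields.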

The goal is for the sub_record3 field to return {"field1":null, "field2":null} when sub_record3 is not present in the data.
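For reference, a minimal sketch of the behaviour I am after, written in plain Python so it runs without an Avro library. It assumes the rule from the Avro specification that a union field's default is written for the first branch of the union, as a plain JSON object with no branch-name wrapper. The `fill_missing` helper and the contents of `old_datum` are hypothetical; the helper only mimics what I expect a reader to do with a missing field and is not part of any Avro API.

```python
import copy
import json

# Desired definition of the new field. The record is the first (and only)
# branch of the union, so the default is a plain object for that record.
sub_record3_field = {
    "name": "sub_record3",
    "type": [
        {
            "type": "record",
            "name": "sub_record3",
            "fields": [
                {"name": "field1", "type": ["null", "string"], "default": None},
                {"name": "field2", "type": ["null", "string"], "default": None},
            ],
        }
    ],
    "default": {"field1": None, "field2": None},
}


def fill_missing(datum, fields):
    """Hypothetical stand-in for reader-side resolution: copy the datum and
    add each absent field, using that field's declared default."""
    resolved = dict(datum)
    for field in fields:
        if field["name"] not in resolved:
            if "default" not in field:
                raise ValueError(f"no default for missing field {field['name']}")
            resolved[field["name"]] = copy.deepcopy(field["default"])
    return resolved


# An old datum written before sub_record3 existed (contents are made up).
old_datum = {"sub_record1": {}, "sub_record2": {}}
new_datum = fill_missing(old_datum, [sub_record3_field])

print(json.dumps(sub_record3_field, indent=2))
print(json.dumps(new_datum["sub_record3"]))
```

Whether a given Avro or Hive version accepts this default is exactly what I have not been able to confirm; the sketch only pins down the expected result.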

Asked:	8 years, 11 months ago
Viewed:	858 times
Active:	8 years, 11 months ago