Avro schema evolution

mag*_*alo 12 avro

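I have two questions:

  1. Is it possible to use the same reader to parse records written with two compatible schemas? For example, Schema V2 has only one additional optional field compared to Schema V1, and I want the reader to understand both. I think the answer is no, but if it is yes, how do I do it?

  2. I tried writing a record with Schema V1 and reading it with Schema V2, but I get the following error:

    org.apache.avro.AvroTypeException: Found foo, expecting foo

I am using avro-1.7.3 and: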

   writer = new GenericDatumWriter<GenericData.Record>(SchemaV1);
   reader = new GenericDatumReader<GenericData.Record>(SchemaV2, SchemaV1);

Here are examples of the two schemas (I also tried adding a namespace, but no luck).

Schema V1:

{
"name": "foo",
"type": "record",
"fields": [{
    "name": "products",
    "type": {
        "type": "array",
        "items": {
            "name": "product",
            "type": "record",
            "fields": [{
                "name": "a1",
                "type": "string"
            }, {
                "name": "a2",
                "type": {"type": "fixed", "name": "a3", "size": 1}
            }, {
                "name": "a4",
                "type": "int"
            }, {
                "name": "a5",
                "type": "int"
            }]
        }
    }
}]
}

Schema V2:

{
"name": "foo",
"type": "record",
"fields": [{
    "name": "products",
    "type": {
        "type": "array",
        "items": {
            "name": "product",
            "type": "record",
            "fields": [{
                "name": "a1",
                "type": "string"
            }, {
                "name": "a2",
                "type": {"type": "fixed", "name": "a3", "size": 1}
            }, {
                "name": "a4",
                "type": "int"
            }, {
                "name": "a5",
                "type": "int"
            }]
        }
    }
},
{
            "name": "purchases",
            "type": ["null",{
                    "type": "array",
                    "items": {
                            "name": "purchase",
                            "type": "record",
                            "fields": [{
                                    "name": "a1",
                                    "type": "int"
                            }, {
                                    "name": "a2",
                                    "type": "int"
                            }]
                    }
            }]
}]
} 

Thanks in advance.

Bew*_*ang 10

I ran into the same problem. It may be a bug in Avro, but you can probably work around it by adding "default": null to the "purchases" field in Schema V2.

See my blog for details: http://ben-tech.blogspot.com/2013/05/avro-schema-evolution.html
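
As a rough sketch of that workaround, the "purchases" field from Schema V2 in the question might look like this once the default is added. Only the "default" line is new; everything else is copied from the question. Avro requires a union's default to match the first branch of the union, which is "null" here.

```json
{
    "name": "purchases",
    "type": ["null", {
        "type": "array",
        "items": {
            "name": "purchase",
            "type": "record",
            "fields": [
                {"name": "a1", "type": "int"},
                {"name": "a2", "type": "int"}
            ]
        }
    }],
    "default": null
}
```

Also note that the two-argument GenericDatumReader constructor takes the writer's schema first and the reader's schema second, so it is worth double-checking the argument order when reading V1 data with V2.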

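  • When you use schema evolution, default values are required. If a field exists in the reader schema but not in the writer schema and has no default, Avro cannot work out how to create that new field in the parsed structure. (6 upvotes)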