Unmarshalling XML to three lists of different objects using STAX Parser

Nic*_*Div 11 java xml stax jaxb xml-parsing

Is there a way I can use STAX parser to efficiently parse an XML document with multiple lists of objects of different classes (POJO). The exact structure of my XML is as follows (class names are not real)

<?xml version="1.0" encoding="utf-8"?>
<root>
    <notes />
    <category_alpha>
        <list_a>
            <class_a_object></class_a_object>
            <class_a_object></class_a_object>
            <class_a_object></class_a_object>
            <class_a_object></class_a_object>
            .
            .
            .
        </list_a>
        <list_b>
            <class_b_object></class_b_object>
            <class_b_object></class_b_object>
            <class_b_object></class_b_object>
            <class_b_object></class_b_object>
            .
            .
            .
        </list_b>
    </category_alpha>
    <category_beta>
        <class_c_object></class_c_object>
        <class_c_object></class_c_object>
        <class_c_object></class_c_object>
        <class_c_object></class_c_object>
        <class_c_object></class_c_object>
        .
        .
        .
        .
        .
    </category_beta>
</root>
Run Code Online (Sandbox Code Playgroud)

I have been using the STAX Parser i.e. XStream library, link: XStream

It works absolutely fine as long as my XML contains list of one class of objects but I dont know how to handle an XML that contains list of objects of different classes.

Any help would be really appreciated and please let me know if I have not provided enough information or I haven't phrased the question properly.

mfe*_*mfe 7

您可以使用声明性流映射(DSM)流解析库轻松将复杂的XML转换为Java类。它使用StAX解析XML。

我跳过获取notes标记并在class_x_object标记内添加一个字段进行演示。

这是XML:

<?xml version="1.0" encoding="utf-8"?>
<root>
    <notes />
    <category_alpha>
        <list_a>
            <class_a_object>
                <fieldA>A1</fieldA>
            </class_a_object>
            <class_a_object>
                <fieldA>A2</fieldA>
            </class_a_object>
            <class_a_object>
                <fieldA>A3</fieldA>
            </class_a_object>

        </list_a>
        <list_b>
            <class_b_object>
                <fieldB>B1</fieldB>
            </class_b_object>
            <class_b_object>
                <fieldB>B2</fieldB>
            </class_b_object>
            <class_b_object>
                <fieldB>B3</fieldB>
            </class_b_object>
        </list_b>
    </category_alpha>
    <category_beta>
        <class_c_object>
          <fieldC>C1</fieldC>
        </class_c_object>
        <class_c_object>
          <fieldC>C2</fieldC>
        </class_c_object>
        <class_c_object>
          <fieldC>C3</fieldC>
        </class_c_object>
    </category_beta>
</root>
Run Code Online (Sandbox Code Playgroud)

首先,您必须以yaml或JSON格式定义XML数据和您的类字段之间的映射。

以下是映射定义:

result:     
   type: object
   path: /root   
   fields:
     listOfA:
       type: array
       path: .*class_a_object  # path is regex
       fields:
          fieldOfA:
            path: fieldA
     listOfB:
       type: array
       path: .*class_b_object
       fields:
          fieldOfB:
            path: fieldB 
     listOfC:
       type: array
       path: .*class_c_object
       fields:
          fieldOfC:
            path: fieldC 
Run Code Online (Sandbox Code Playgroud)

您要反序列化的Java类:

public class Root {
    public List<A> listOfA;
    public List<B> listOfB;
    public List<C> listOfC;

    public static class A{
        public String fieldOfA;
    }
    public static class B{
        public String fieldOfB;
    }
    public static class C{
        public String fieldOfC;
    }

}   
Run Code Online (Sandbox Code Playgroud)

解析XML的Java代码:

DSM dsm=new DSMBuilder(new File("path/to/mapping.yaml")).setType(DSMBuilder.TYPE.XML).create(Root.class);
Root root =  (Root)dsm.toObject(xmlFileContent);
// write root object as json
dsm.getObjectMapper().writerWithDefaultPrettyPrinter().writeValue(System.out, object);
Run Code Online (Sandbox Code Playgroud)

输出如下:

{
  "listOfA" : [ {"fieldOfA" : "A1"}, {"fieldOfA" : "A2"}, {"fieldOfA" : "A3"} ],
  "listOfB" : [ {"fieldOfB" : "B1"}, {"fieldOfB" : "B2"}, "fieldOfB" : "B3"} ],
  "listOfC" : [ {"fieldOfC" : "C1"}, {"fieldOfC" : "C2"}, {"fieldOfC" : "C3"} ]
}
Run Code Online (Sandbox Code Playgroud)

更新:

从您的评论中可以理解,您希望将大XML文件作为流读取。在读取文件时处理数据。

DSM允许您在读取XML时处理数据。

声明三种不同的功能来处理部分数据。

FunctionExecutor processA=new FunctionExecutor(){
            @Override
            public void execute(Params params) {

                Root.A object=params.getCurrentNode().toObject(Root.A.class);

                // process aClass; save to db. call service etc.
            }
        };
FunctionExecutor processB=new FunctionExecutor(){
            @Override
            public void execute(Params params) {

                Root.B object=params.getCurrentNode().toObject(Root.B.class);

                // process aClass; save to db. call service etc.
            }
        };

FunctionExecutor processC=new FunctionExecutor(){
            @Override
            public void execute(Params params) {

                Root.C object=params.getCurrentNode().toObject(Root.C.class);

                // process aClass; save to db. call service etc.
            }
        };
Run Code Online (Sandbox Code Playgroud)

向DSM注册功能

 DSMBuilder builder = new DSMBuilder(new File("path/to/mapping.yaml")).setType(DSMBuilder.TYPE.XML);

       // register function
        builder.registerFunction("processA",processA);
        builder.registerFunction("processB",processB);
        builder.registerFunction("processC",processC);

        DSM dsm= builder.create();
        Object object =  dsm.toObject(xmlContent);
Run Code Online (Sandbox Code Playgroud)

更改映射文件以调用注册功能

result:     
   type: object
   path: /root   
   fields:
     listOfA:
       type: object
       function: processA  # when 'class_a_object' tag closed processA function will be executed.
       path: .*class_a_object  # path is regex
       fields:
          fieldOfA:
            path: fieldA
     listOfB:
       type: object
       path: .*class_b_object
       function: processB# register function
       fields:
          fieldOfB:
            path: fieldB 
     listOfC:
       type: object
       path: .*class_c_object
       function: processC# register function
       fields:
          fieldOfC:
            path: fieldC 
Run Code Online (Sandbox Code Playgroud)