SAX IncrementalParser in Jython

min*_*hee 10 python xml sax jython

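The Python standard library provides the xml.sax.xmlreader.IncrementalParser interface, which has a feed() method. Jython also provides an xml.sax package, implemented on top of a Java SAX parser, but it does not seem to provide IncrementalParser.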
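For reference, here is a minimal sketch of the feed()-style usage that works on CPython, where xml.sax.make_parser() returns the expat-based reader. The NameCollector handler and the sample chunks are made up for illustration; this is not something Jython's xml.sax is known to support.

```python
import xml.sax


class NameCollector(xml.sax.ContentHandler):
    """Hypothetical handler that records element names as they start."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


handler = NameCollector()
parser = xml.sax.make_parser()  # expat-based IncrementalParser on CPython
parser.setContentHandler(handler)

# The document arrives in arbitrary pieces; a chunk boundary may split a tag.
chunks = [
    "<employees><emp",
    "loyee id='111'><firstName>Rak",
    "esh</firstName></employee></employees>",
]
for chunk in chunks:
    parser.feed(chunk)  # push the next piece of the document
parser.close()          # signal end of input and flush pending events
```

After close(), handler.names holds the element names in document order. The point is that the caller decides when each chunk is pushed in, which is what I am missing in Jython.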

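Is there any way to parse chunks of XML incrementally in Jython? At first glance I thought it could be done with a coroutine library such as greenlet, but I immediately realized that greenlet cannot be used in Jython.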

Nat*_*ial 3

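You can use StAX. A StAX parser streams the document like SAX does, but it maintains a cursor and lets you pull the content at the cursor using hasNext() and next().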

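The following code is adapted from this Java example. Note that this is my first attempt at using Jython, so please don't hang me if I did something unconventional, but the example works.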

http://www.javacodegeeks.com/2013/05/parsing-xml-using-dom-sax-and-stax-parser-in-java.html

from javax.xml.stream import XMLStreamConstants, XMLInputFactory
from java.io import ByteArrayInputStream
from java.lang import String

xml = String(
"""<?xml version="1.0" encoding="ISO-8859-1"?>
<employees>
  <employee id="111">
    <firstName>Rakesh</firstName>
    <lastName>Mishra</lastName>
    <location>Bangalore</location>
  </employee>
  <employee id="112">
    <firstName>John</firstName>
    <lastName>Davis</lastName>
    <location>Chennai</location>
  </employee>
  <employee id="113">
    <firstName>Rajesh</firstName>
    <lastName>Sharma</lastName>
    <location>Pune</location>
  </employee>
</employees>
""")

class Employee:
    id = None
    firstName = None
    lastName = None
    location = None

    def __str__(self):
        return self.firstName + " " + self.lastName + "(" + self.id + ") " + self.location

factory = XMLInputFactory.newInstance()
# Encode with the charset named in the XML declaration, not the platform default
reader = factory.createXMLStreamReader(ByteArrayInputStream(xml.getBytes("ISO-8859-1")))
employees = []
employee = None
tagContent = None

while reader.hasNext():
    event = reader.next()

    if event == XMLStreamConstants.START_ELEMENT:
        if "employee" == reader.getLocalName():
            employee = Employee()
            employee.id = reader.getAttributeValue(0)
    elif event == XMLStreamConstants.CHARACTERS:
        tagContent = reader.getText()
    elif event == XMLStreamConstants.END_ELEMENT:
        if "employee" == reader.getLocalName():
            employees.append(employee)
        elif "firstName" == reader.getLocalName():
            employee.firstName = tagContent
        elif "lastName" == reader.getLocalName():
            employee.lastName = tagContent
        elif "location" == reader.getLocalName():
            employee.location = tagContent

for employee in employees:
    print employee