从Python输出XML最容易的非内存密集型方法是什么?

Yuv*_*uvi 14 python xml streaming

基本上,类似于System.Xml.XmlWriter - 流式XML Writer,它不会产生大量的内存开销.因此排除了xml.dom和xml.dom.minidom.建议?

Tom*_*ham 15

我想你会发现xml.sax.saxutils中的XMLGenerator最接近你想要的东西.

import time
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesNSImpl

LOG_LEVELS = ['DEBUG', 'WARNING', 'ERROR']


class xml_logger:
    def __init__(self, output, encoding):
        """
        Set up a logger object, which takes SAX events and outputs
        an XML log file
        """
        logger = XMLGenerator(output, encoding)
        logger.startDocument()
        attrs = AttributesNSImpl({}, {})
        logger.startElementNS((None, u'log'), u'log', attrs)
        self._logger = logger
        self._output = output
        self._encoding = encoding
        return

    def write_entry(self, level, msg):
        """
        Write a log entry to the logger
        level - the level of the entry
        msg   - the text of the entry.  Must be a Unicode object
        """
        #Note: in a real application, I would use ISO 8601 for the date
        #asctime used here for simplicity
        now = time.asctime(time.localtime())
        attr_vals = {
            (None, u'date'): now,
            (None, u'level'): LOG_LEVELS[level],
            }
        attr_qnames = {
            (None, u'date'): u'date',
            (None, u'level'): u'level',
            }
        attrs = AttributesNSImpl(attr_vals, attr_qnames)
        self._logger.startElementNS((None, u'entry'), u'entry', attrs)
        self._logger.characters(msg)
        self._logger.endElementNS((None, u'entry'), u'entry')
        return

    def close(self):
        """
        Clean up the logger object
        """
        self._logger.endElementNS((None, u'log'), u'log')
        self._logger.endDocument()
        return

if __name__ == "__main__":
    #Test it out
    import sys
    xl = xml_logger(sys.stdout, 'utf-8')
    xl.write_entry(2, u"Vanilla log entry")
    xl.close()   

您可能希望查看我从http://www.xml.com/pub/a/2003/03/12/py-xml.html获得的文章的其余部分.


thr*_*thr -4

xml.etree.cElementTree,自 2.5 起包含在 CPython 的默认发行版中。读取和写入 XML 的速度快如闪电。

  • 尽管如此,cElementTree 并不是流式编写器,因此将使用与您创建的 XML 树的大小成线性关系的内存,尽管比 xml.dom 少得多。 (4认同)