Bre*_*nor 2 python xml lxml dtd
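I am trying to write a validation script that validates XML against the NITF DTD, http://www.iptc.org/std/NITF/3.4/specification/dtd/nitf-3-4.dtd. Based on this post, I came up with the following simple script to validate an NITF XML document. Below is the error message I get when running the script; it is not very descriptive and is hard to debug. Any help is appreciated.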
#!/usr/bin/env python

def main():
    from lxml import etree, objectify
    from StringIO import StringIO

    f = open('nitf_test.xml')
    xml_doc = f.read()
    f.close()

    f = open('nitf-3-4.dtd')
    dtd_doc = f.read()
    f.close()

    dtd = etree.DTD(StringIO(dtd_doc))
    tree = objectify.parse(StringIO(xml_doc))
    dtd.validate(tree)

if __name__ == '__main__':
    main()
Traceback error message:
Traceback (most recent call last):
  File "./test_nitf_doc.py", line 23, in <module>
    main()
  File "./test_nitf_doc.py", line 16, in main
    dtd = etree.DTD(StringIO(dtd_doc))
  File "dtd.pxi", line 43, in lxml.etree.DTD.__init__ (src/lxml/lxml.etree.c:126056)
  File "dtd.pxi", line 117, in lxml.etree._parseDtdFromFilelike (src/lxml/lxml.etree.c:126727)
lxml.etree.DTDParseError: error parsing DTD
If I change the line:
dtd = etree.DTD(StringIO(dtd_doc))
to:
dtd = etree.DTD(dtd_doc)
the error I get is:
lxml.etree.DTDParseError: failed to load external entity "NULL"
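I took a look at nitf-3-4.dtd and found that it references an external module, xhtml-ruby-1.mod, which can be downloaded from the IPTC site. This file needs to be present in the current directory so that the DTD parser can load it.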
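Since the original complaint was that "error parsing DTD" is hard to debug, here is a minimal sketch of how the underlying parser messages might be surfaced. It relies on the `error_log` attribute that lxml exceptions carry. The helper name `load_dtd` and the two toy DTDs are hypothetical and are not part of NITF; they only stand in for a DTD that parses and one that does not.

```python
from io import BytesIO
from lxml import etree


def load_dtd(dtd_file):
    """Return an etree.DTD, or None after printing the parser's error log."""
    try:
        return etree.DTD(dtd_file)
    except etree.DTDParseError as exc:
        # exc.error_log holds the individual messages reported by the parser,
        # which are usually more specific than "error parsing DTD".
        for entry in exc.error_log:
            print("line %s: %s" % (entry.line, entry.message))
        return None


# Toy DTDs (hypothetical, not NITF): one well-formed, one truncated.
good = load_dtd(BytesIO(b"<!ELEMENT root EMPTY>"))
bad = load_dtd(BytesIO(b"<!ELEMENT root"))
```

With the real nitf-3-4.dtd, the same pattern should show which declaration or entity the parser stopped on, although the exact wording of the messages depends on the installed libxml2 version.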
Complete working example (assuming you have a valid NITF document):
% wget http://www.iptc.org/std/NITF/3.4/specification/dtd/nitf-3-4.dtd
% wget http://www.iptc.org/std/NITF/3.4/specification/dtd/xhtml-ruby-1.mod
Python code:
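from lxml import etree, objectify

# xhtml-ruby-1.mod must sit in the current directory, next to nitf-3-4.dtd
with open('nitf-3-4.dtd', 'rb') as dtd_file:
    dtd = etree.DTD(dtd_file)
with open('nitf_test.xml', 'rb') as xml_file:
    tree = objectify.parse(xml_file)
print(dtd.validate(tree))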
Output:
% python nitf_test.py
True
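If `validate` returns False instead, the reasons can be read from the validator's `error_log`. The sketch below uses a one-element toy DTD and two in-memory documents (all hypothetical, not NITF) so that it runs without any files on disk; with real data, the DTD and the tree would be loaded as in the example above.

```python
from io import BytesIO
from lxml import etree

# Toy DTD (hypothetical): a single element that must be empty.
dtd = etree.DTD(BytesIO(b"<!ELEMENT root EMPTY>"))

valid_doc = etree.parse(BytesIO(b"<root/>"))
invalid_doc = etree.parse(BytesIO(b"<root><child/></root>"))

for doc in (valid_doc, invalid_doc):
    if dtd.validate(doc):
        print("valid")
    else:
        # The log is refreshed on each validate() call; keep only real errors.
        for entry in dtd.error_log.filter_from_errors():
            print("line %s: %s" % (entry.line, entry.message))
```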