Sco*_*oby 18 python xml xsd xml-validation python-2.7
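I have an XML file and an XML schema (XSD). I want to validate the file against the schema and check whether it conforms. I am using Python, but I am open to any language if Python has no suitable library for this.
What is the best option here? I am concerned about how quickly I can get this up and running.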
ale*_*cxe 25
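lxml, definitely.
Define an XMLParser with a predefined schema, load the file with fromstring(), and catch the error raised when the document does not conform to the schema: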
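from lxml import etree


def validate(xmlparser, xmlfilename):
    """Return True if the file parses and validates against the parser's schema."""
    try:
        with open(xmlfilename, 'r') as f:
            etree.fromstring(f.read(), xmlparser)
        return True
    # lxml's documentation shows a schema-aware parser reporting invalid input
    # as XMLSyntaxError; XMLSchemaError is caught as well, as in the original.
    except (etree.XMLSyntaxError, etree.XMLSchemaError):
        return False


# Compile the schema once and attach it to a parser that is reused for every file.
schema_file = 'schema.xsd'
with open(schema_file, 'r') as f:
    schema_root = etree.XML(f.read())

schema = etree.XMLSchema(schema_root)
xmlparser = etree.XMLParser(schema=schema)

filenames = ['input1.xml', 'input2.xml', 'input3.xml']
for filename in filenames:
    if validate(xmlparser, filename):
        print("%s validates" % filename)
    else:
        print("%s doesn't validate" % filename)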
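If you also want to know why a document failed, one option is to parse it without a schema and then call the compiled schema's validate() method, which returns a boolean and records details in error_log. The following is only a minimal sketch: the inline schema, the sample documents, and the check() helper are made up for illustration and are not part of the answer above.

```python
from lxml import etree

# Hypothetical inline schema, used only to keep the sketch self-contained.
XSD = b"""<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="count" type="xs:integer"/>
</xs:schema>"""

# Compile the schema once; the XMLSchema object can be reused for many documents.
schema = etree.XMLSchema(etree.XML(XSD))


def check(xml_bytes):
    """Return (is_valid, messages) for one document given as bytes."""
    doc = etree.fromstring(xml_bytes)  # plain parse, no schema attached
    is_valid = schema.validate(doc)    # returns True/False instead of raising
    messages = [entry.message for entry in schema.error_log]
    return is_valid, messages


good = check(b"<count>3</count>")
bad = check(b"<count>three</count>")
print(good)
print(bad)
```

A malformed document (as opposed to a well-formed but invalid one) would still raise etree.XMLSyntaxError from fromstring(), so a real script would probably want to catch that as well.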
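If the schema file contains an XML declaration with an encoding (for example <?xml version="1.0" encoding="UTF-8"?>), the code above fails on Python 3, where text mode yields a str, with the following error: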
Traceback (most recent call last):
  File "<input>", line 2, in <module>
    schema_root = etree.XML(f.read())
  File "src/lxml/etree.pyx", line 3192, in lxml.etree.XML
  File "src/lxml/parser.pxi", line 1872, in lxml.etree._parseMemoryDocument
ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.
The solution is to open the files in binary mode, open(..., 'rb'), so that lxml receives bytes:
[...]
def validate(xmlparser, xmlfilename):
    try:
        with open(xmlfilename, 'rb') as f:
[...]
with open(schema_file, 'rb') as f:
[...]
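Another way to sidestep the str/bytes issue, shown here only as a sketch, is to hand lxml the file path and let etree.parse() read the bytes and honour the encoding declaration itself. The temporary files and their contents below are invented for the example; the first part simply tries the str input that triggers the ValueError described above.

```python
import os
import tempfile

from lxml import etree

# Hypothetical schema and document; both carry an encoding declaration.
XSD_TEXT = """<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="name" type="xs:string"/>
</xs:schema>
"""
XML_TEXT = """<?xml version="1.0" encoding="UTF-8"?>
<name>Zo\u00eb</name>
"""

# Passing a str that contains an encoding declaration is what fails above.
try:
    etree.XML(XSD_TEXT)
    str_input_failed = False
except ValueError:
    str_input_failed = True
print("str input rejected:", str_input_failed)

with tempfile.TemporaryDirectory() as tmp:
    xsd_path = os.path.join(tmp, "schema.xsd")
    xml_path = os.path.join(tmp, "input.xml")
    with open(xsd_path, "wb") as f:
        f.write(XSD_TEXT.encode("utf-8"))
    with open(xml_path, "wb") as f:
        f.write(XML_TEXT.encode("utf-8"))

    # etree.parse() opens the files itself, so no text/binary mode is involved.
    schema = etree.XMLSchema(etree.parse(xsd_path))
    is_valid = schema.validate(etree.parse(xml_path))

print("document valid:", is_valid)
```

This variant uses validate() on an already parsed tree rather than a schema-aware parser, so an invalid document yields False instead of an exception.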