How do I validate against multiple XSD schemas with lxml?

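I'm writing a unit test that validates the sitemap XML I generate: it fetches the XSD schemas and validates the document against them with Python's lxml library.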

Here is some of the metadata on my root element:

xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 
http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd 
http://www.google.com/schemas/sitemap-image/1.1 
http://www.google.com/schemas/sitemap-image/1.1/sitemap-image.xsd"

And this is the test code:

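from io import BytesIO

import requests
from lxml import etree

# Cache of compiled schemas, keyed by XSD URL, so each one is fetched only once.
_xsd_validators = {}


def get_xsd_validator(url):
    if url not in _xsd_validators:
        # requests returns bytes in .content, so wrap it in BytesIO for etree.parse.
        xsd_doc = etree.parse(BytesIO(requests.get(url).content))
        _xsd_validators[url] = etree.XMLSchema(xsd_doc)
    return _xsd_validators[url]


# this util function is later on in a TestCase
def validate_xml(self, content):
    content.seek(0)
    doc = etree.parse(content)
    # split() with no argument also handles newlines and repeated spaces.
    schema_loc = doc.getroot().attrib.get(
        '{http://www.w3.org/2001/XMLSchema-instance}schemaLocation').split()
    # lxml doesn't like multiple namespaces
    # schemaLocation alternates namespace, location; take every second token.
    for loc in schema_loc[1::2]:
        get_xsd_validator(loc).assertValid(doc)
    return doc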
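
For reference, an `xsi:schemaLocation` value is a whitespace-separated list of namespace/location pairs, and the whitespace may include newlines and indentation. Below is a minimal sketch of pairing the tokens up; `parse_schema_location` is a hypothetical helper name used only for illustration and is not part of lxml.

```python
def parse_schema_location(value):
    """Return (namespace, location) pairs from an xsi:schemaLocation value.

    Hypothetical helper for illustration; not an lxml API.
    """
    # str.split() with no argument splits on any run of whitespace,
    # so multi-line, indented attribute values are handled too.
    tokens = value.split()
    if len(tokens) % 2 != 0:
        raise ValueError("schemaLocation must hold namespace/location pairs")
    # Even positions are namespaces, odd positions are schema locations.
    return list(zip(tokens[0::2], tokens[1::2]))


schema_location = """
    http://www.sitemaps.org/schemas/sitemap/0.9
    http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd
    http://www.google.com/schemas/sitemap-image/1.1
    http://www.google.com/schemas/sitemap-image/1.1/sitemap-image.xsd"""

pairs = parse_schema_location(schema_location)
for namespace, location in pairs:
    print(namespace, "->", location)
```

Keeping the namespace next to its location matters if the schemas are later combined, since an `xs:import` needs both values.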

Sample XML that fails validation:

<?xml version="1.0" encoding="UTF-8"?>
<urlset
  xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
  xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="
    http://www.sitemaps.org/schemas/sitemap/0.9
    http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd
    http://www.google.com/schemas/sitemap-image/1.1
    http://www.google.com/schemas/sitemap-image/1.1/sitemap-image.xsd"
> …
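
One approach that may help here, offered as a sketch rather than a confirmed fix: instead of validating against each XSD separately, build a small wrapper schema that pulls in every namespace with `xs:import`, and compile that single schema with `etree.XMLSchema`. The helper names `build_wrapper_xsd` and `compile_combined_schema` and the local file paths are assumptions for illustration. The sketch also assumes the XSD files have already been downloaded to disk, because whether libxml2 fetches remote schema locations depends on how it was built.

```python
from xml.sax.saxutils import quoteattr


def build_wrapper_xsd(pairs):
    """Return the text of a schema that only imports the given namespaces.

    `pairs` is a list of (namespace, schema_location) tuples.
    Hypothetical helper for illustration; not an lxml API.
    """
    imports = "\n".join(
        "  <xs:import namespace=%s schemaLocation=%s/>"
        % (quoteattr(namespace), quoteattr(location))
        for namespace, location in pairs
    )
    return (
        '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">\n'
        + imports
        + "\n</xs:schema>\n"
    )


def compile_combined_schema(pairs):
    """Compile the wrapper schema into a single lxml validator."""
    # Imported here so build_wrapper_xsd() has no third-party dependency.
    from lxml import etree

    wrapper = build_wrapper_xsd(pairs).encode("utf-8")
    return etree.XMLSchema(etree.fromstring(wrapper))


# Hypothetical local copies of the two schemas (assumed paths).
pairs = [
    ("http://www.sitemaps.org/schemas/sitemap/0.9",
     "/tmp/schemas/sitemap.xsd"),
    ("http://www.google.com/schemas/sitemap-image/1.1",
     "/tmp/schemas/sitemap-image.xsd"),
]

wrapper_xsd = build_wrapper_xsd(pairs)
print(wrapper_xsd)

# Usage, once the XSD files exist at the paths above:
# schema = compile_combined_schema(pairs)
# schema.assertValid(doc)
```

The idea is that a single compiled schema knows the declarations from both namespaces, so the `image:` elements nested inside `url` entries can be checked in the same pass as the sitemap elements.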

python xml xsd lxml

2 votes · 1 answer · 1194 views