Jac*_* M. 7 python xml xpath lxml
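I'm trying to understand an XPath that was sent to me for use with ACORD XML forms (a common format in insurance). The XPath they sent me is (truncated for brevity):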
./PersApplicationInfo/InsuredOrPrincipal[InsuredOrPrincipalInfo/InsuredOrPrincipalRoleCd="AN"]/GeneralPartyInfo
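Where I'm running into trouble is that Python's lxml library tells me that [InsuredOrPrincipalInfo/InsuredOrPrincipalRoleCd="AN"] is an invalid predicate. I can't find anything in the XPath spec that identifies this syntax, so I don't know how to modify the predicate.
Is there any documentation on what this predicate selects? Also, is this even a valid predicate, or has it been mangled somewhere along the way?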
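Possibly relevant:
I believe the company I'm working with is an MS shop, so this XPath may be valid in C# or some other language in that stack? I'm not entirely sure.
Update:
As requested in the comments, here is some additional information.
Sample XML: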
<ACORD>
<InsuranceSvcRq>
<HomePolicyQuoteInqRq>
<PersPolicy>
<PersApplicationInfo>
<InsuredOrPrincipal>
<InsuredOrPrincipalInfo>
<InsuredOrPrincipalRoleCd>AN</InsuredOrPrincipalRoleCd>
</InsuredOrPrincipalInfo>
<GeneralPartyInfo>
<Addr>
<Addr1></Addr1>
</Addr>
</GeneralPartyInfo>
</InsuredOrPrincipal>
</PersApplicationInfo>
</PersPolicy>
</HomePolicyQuoteInqRq>
</InsuranceSvcRq>
</ACORD>
Code sample (using the full XPath rather than the snippet):
>>> from lxml import etree
>>> tree = etree.fromstring(raw)
>>> tree.find('./InsuranceSvcRq/HomePolicyQuoteInqRq/PersPolicy/PersApplicationInfo/InsuredOrPrincipal[InsuredOrPrincipalInfo/InsuredOrPrincipalRoleCd="AN"]/GeneralPartyInfo/Addr/Addr1')
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "lxml.etree.pyx", line 1409, in lxml.etree._Element.find (src/lxml/lxml.etree.c:39972)
File "/Library/Python/2.5/site-packages/lxml-2.3-py2.5-macosx-10.3-i386.egg/lxml/_elementpath.py", line 271, in find
it = iterfind(elem, path, namespaces)
File "/Library/Python/2.5/site-packages/lxml-2.3-py2.5-macosx-10.3-i386.egg/lxml/_elementpath.py", line 261, in iterfind
selector = _build_path_iterator(path, namespaces)
File "/Library/Python/2.5/site-packages/lxml-2.3-py2.5-macosx-10.3-i386.egg/lxml/_elementpath.py", line 245, in _build_path_iterator
selector.append(ops[token[0]](_next, token))
File "/Library/Python/2.5/site-packages/lxml-2.3-py2.5-macosx-10.3-i386.egg/lxml/_elementpath.py", line 207, in prepare_predicate
raise SyntaxError("invalid predicate")
SyntaxError: invalid predicate
unu*_*tbu 18
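Change tree.find to tree.xpath. find and findall exist in lxml to provide compatibility with other ElementTree implementations. These methods do not implement the entire XPath language. To use XPath expressions with more advanced features, use the xpath method, the XPath class, or XPathEvaluator.
For example: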
import io
import lxml.etree as ET
# Bytes literal, so io.BytesIO accepts it on Python 3 as well as Python 2.
content = b'''\
<ACORD>
<InsuranceSvcRq>
<HomePolicyQuoteInqRq>
<PersPolicy>
<PersApplicationInfo>
<InsuredOrPrincipal>
<InsuredOrPrincipalInfo>
<InsuredOrPrincipalRoleCd>AN</InsuredOrPrincipalRoleCd>
</InsuredOrPrincipalInfo>
<GeneralPartyInfo>
<Addr>
<Addr1></Addr1>
</Addr>
</GeneralPartyInfo>
</InsuredOrPrincipal>
</PersApplicationInfo>
</PersPolicy>
</HomePolicyQuoteInqRq>
</InsuranceSvcRq>
</ACORD>
'''
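# Parse the document from an in-memory file object.
tree = ET.parse(io.BytesIO(content))
# The nested path inside [...] needs full XPath, so use xpath() rather than find().
path = '//PersApplicationInfo/InsuredOrPrincipal[InsuredOrPrincipalInfo/InsuredOrPrincipalRoleCd="AN"]/GeneralPartyInfo'
result = tree.xpath(path)
print(result)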
yields
[<Element GeneralPartyInfo at b75a8194>]
whereas tree.find yields
SyntaxError: invalid node predicate
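If find/findall has to be kept (for example, to stay compatible with other ElementTree implementations), one possible workaround is to select the candidate elements with a simple path and apply the nested condition in Python. The limited path language behind find accepts predicates such as [tag], [tag='text'], [@attrib='value'] and [position], but not a multi-step path inside the brackets. The sketch below uses the standard library's xml.etree.ElementTree so it runs without lxml; the second InsuredOrPrincipal element, its role code "XX", and the Addr1 values are hypothetical and added only to show the filtering.

```python
import xml.etree.ElementTree as ET

# Sample document based on the question; the second party, the "XX" role code
# and the Addr1 texts are hypothetical illustration data.
content = """\
<ACORD>
  <InsuranceSvcRq>
    <HomePolicyQuoteInqRq>
      <PersPolicy>
        <PersApplicationInfo>
          <InsuredOrPrincipal>
            <InsuredOrPrincipalInfo>
              <InsuredOrPrincipalRoleCd>AN</InsuredOrPrincipalRoleCd>
            </InsuredOrPrincipalInfo>
            <GeneralPartyInfo><Addr><Addr1>first</Addr1></Addr></GeneralPartyInfo>
          </InsuredOrPrincipal>
          <InsuredOrPrincipal>
            <InsuredOrPrincipalInfo>
              <InsuredOrPrincipalRoleCd>XX</InsuredOrPrincipalRoleCd>
            </InsuredOrPrincipalInfo>
            <GeneralPartyInfo><Addr><Addr1>second</Addr1></Addr></GeneralPartyInfo>
          </InsuredOrPrincipal>
        </PersApplicationInfo>
      </PersPolicy>
    </HomePolicyQuoteInqRq>
  </InsuranceSvcRq>
</ACORD>
"""

root = ET.fromstring(content)

# Step 1: a plain path without any predicate, which find/findall understands.
candidates = root.findall(".//PersApplicationInfo/InsuredOrPrincipal")

# Step 2: apply the nested condition in Python instead of inside [...].
matches = []
for party in candidates:
    role = party.findtext("InsuredOrPrincipalInfo/InsuredOrPrincipalRoleCd")
    if role == "AN":
        matches.extend(party.findall("GeneralPartyInfo"))

print(len(candidates), [m.findtext("Addr/Addr1") for m in matches])
# prints: 2 ['first']
```

The same two-step approach should work unchanged with lxml's find/findall, since they are meant to mirror the ElementTree API.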
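The xpath method is only one of the three options mentioned above. As a minimal sketch, assuming lxml is installed, the same expression can also be compiled once with the XPath class or evaluated through an XPathEvaluator bound to a document. The variable names (find_parties, evaluator, and so on) are illustrative only.

```python
from lxml import etree

content = b"""\
<ACORD>
  <InsuranceSvcRq>
    <HomePolicyQuoteInqRq>
      <PersPolicy>
        <PersApplicationInfo>
          <InsuredOrPrincipal>
            <InsuredOrPrincipalInfo>
              <InsuredOrPrincipalRoleCd>AN</InsuredOrPrincipalRoleCd>
            </InsuredOrPrincipalInfo>
            <GeneralPartyInfo>
              <Addr>
                <Addr1></Addr1>
              </Addr>
            </GeneralPartyInfo>
          </InsuredOrPrincipal>
        </PersApplicationInfo>
      </PersPolicy>
    </HomePolicyQuoteInqRq>
  </InsuranceSvcRq>
</ACORD>
"""

root = etree.fromstring(content)
path = ('//PersApplicationInfo/InsuredOrPrincipal'
        '[InsuredOrPrincipalInfo/InsuredOrPrincipalRoleCd="AN"]'
        '/GeneralPartyInfo')

# Option 1: the xpath() method on an element (or element tree).
via_method = root.xpath(path)

# Option 2: compile the expression once, then apply it to any element or tree.
find_parties = etree.XPath(path)
via_class = find_parties(root)

# Option 3: bind an evaluator to one document, then pass it expression strings.
evaluator = etree.XPathEvaluator(root)
via_evaluator = evaluator(path)

for label, result in [("xpath()", via_method),
                      ("XPath", via_class),
                      ("XPathEvaluator", via_evaluator)]:
    print(label, [element.tag for element in result])
```

A compiled XPath object is probably the better fit when one expression is applied to many documents, while an evaluator suits running many expressions against a single document.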