XPathEvalError: Unregistered function for matches() in lxml

Abt*_*Pst 5 python xpath lxml xpath-2.0

I am trying to use the following XPath query in Python:

from lxml.html.soupparser import fromstring
root = fromstring(inString)
nodes = root.xpath(".//p3[matches(.,'ABC')]//preceding::p2//p3")

But it gives me this error:

  nodes = root.xpath(".//p3[matches(.,'ABC')]//preceding::p2//p3")
  File "lxml.etree.pyx", line 1507, in lxml.etree._Element.xpath (src\lxml\lxml.etree.c:52198)
  File "xpath.pxi", line 307, in lxml.etree.XPathElementEvaluator.__call__ (src\lxml\lxml.etree.c:152124)
  File "xpath.pxi", line 227, in lxml.etree._XPathEvaluatorBase._handle_result (src\lxml\lxml.etree.c:151097)
  File "xpath.pxi", line 212, in lxml.etree._XPathEvaluatorBase._raise_eval_error (src\lxml\lxml.etree.c:150896)
  lxml.etree.XPathEvalError: Unregistered function

How can I use XPath 2.0 functions with lxml here?

Clarification

I was previously using the contains function:

nodes = root.xpath(".//p3[contains(text(),'ABC')]//preceding::p2//p3")

The problem is that the text in my XML contains line breaks and spaces, so I tried something like this:

nodes = root.xpath(".//p3[contains(normalize-space(),'ABC')]//preceding::p2//p3")

But that had no effect. Finally I tried the matches function, and got the error above.

Sample XML

<doc>

<q></q>

<p1>
    <p2 dd="ert" ji="pp">

        <p3>1</p3>
        <p3>2</p3>
        <p3>
               ABC
        </p3>
        <p3>3</p3>

     </p2>

     <p2 dd="ert" ji="pp">

        <p3>4</p3>
        <p3>5</p3>
        <p3>ABC</p3>
        <p3>6</p3>

     </p2>

</p1>
<r></r>
<p1>
    <p2 dd="ert" ji="pp">

        <p3>7</p3>
        <p3>8</p3>
        <p3>ABC
        </p3>
        <p3>9</p3>

     </p2>

     <p2 dd="ert" ji="pp">

        <p3>10</p3>
        <p3>11</p3>
        <p3>ABC</p3>
        <p3>12</p3>

     </p2>

</p1>
</doc>

har*_*r07 9

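lxml only implements XPath 1.0, so the XPath 2.0 function matches() is not available. As mentioned in the other answer, and to emphasize another part of the documentation quoted there, you can use the EXSLT extensions to call the regular-expression function match() from lxml, for example: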

......
ns = {"re": "http://exslt.org/regular-expressions"}
nodes = root.xpath(".//p3[re:match(.,'ABC')]//preceding::p2//p3", namespaces=ns)
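For reference, here is a minimal self-contained sketch of the same idea. It assumes plain lxml.etree instead of the soupparser used in the question, and a trimmed-down copy of the sample XML; the names xml, hits_test and hits_match are only illustrative. It also tries re:test, the boolean function from the same EXSLT namespace, which may read more naturally inside a predicate than re:match.

```python
from lxml import etree

# Trimmed-down version of the sample XML; one <p3> pads "ABC" with whitespace.
xml = """<doc>
<p1>
  <p2 dd="ert" ji="pp">
    <p3>1</p3>
    <p3>
         ABC
    </p3>
  </p2>
  <p2 dd="ert" ji="pp">
    <p3>2</p3>
    <p3>ABC</p3>
  </p2>
</p1>
</doc>"""

root = etree.fromstring(xml)

# Prefix-to-URI map for the EXSLT regular-expressions namespace.
ns = {"re": "http://exslt.org/regular-expressions"}

# re:test returns a boolean: true when the regex matches anywhere in the text.
hits_test = root.xpath(".//p3[re:test(., 'ABC')]", namespaces=ns)

# re:match returns the matches themselves; an empty result is false in a predicate.
hits_match = root.xpath(".//p3[re:match(., 'ABC')]", namespaces=ns)

for p3 in hits_test:
    print(repr(p3.text.strip()))
```

Both predicates should select the same two p3 elements here, including the one whose text is surrounded by line breaks and spaces. The rest of the original query (//preceding::p2//p3) can be appended unchanged.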