Java: How to locate an element via an XPath string on org.w3c.dom.Document

Question

How can I quickly locate an element (or elements) in a given org.w3c.dom.Document using an XPath string? There doesn't seem to be a FindElementsByXpath() method. For example:

/html/body/p/div[3]/a

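I've found that recursively iterating through every level of child nodes is very slow when there are many elements with the same name. Any suggestions?

I can't use any parser or library; I have to work with the W3C DOM document only.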

Answer 1

Tom*_*icz 91

Try this:

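import java.io.FileInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Obtain the Document somehow; it doesn't matter how
DocumentBuilder b = DocumentBuilderFactory.newInstance().newDocumentBuilder();
org.w3c.dom.Document doc = b.parse(new FileInputStream("page.html"));

// Evaluate the XPath against the Document itself
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xPath.evaluate("/html/body/p/div[3]/a",
        doc, XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); ++i) {
    // Each match is an <a> element here, so the cast is safe
    Element e = (Element) nodes.item(i);
}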

With the following page.html file:

<html>
  <head>
  </head>
  <body>
  <p>
    <div></div>
    <div></div>
    <div><a>link</a></div>
  </p>
  </body>
</html>
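
For anyone who wants to try this without creating a file, here is a minimal self-contained sketch of the same approach. It uses only the JDK's javax.xml.xpath and javax.xml.parsers packages. The class name XPathLookupDemo and the helpers parse, findAll, and findFirst are made up for illustration and are not part of any API. It assumes the markup is well-formed XML, since DocumentBuilder is an XML parser and not an HTML one, and that the expression selects elements.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical demo class; the names below are illustrative only.
class XPathLookupDemo {

    // Same structure as page.html above, inlined so no file is needed.
    static final String PAGE =
            "<html><head></head><body><p>"
            + "<div></div><div></div><div><a>link</a></div>"
            + "</p></body></html>";

    // Parse a well-formed XML string into a W3C DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("input is not well-formed XML", e);
        }
    }

    // Return every node matched by the expression (possibly an empty list).
    static NodeList findAll(Document doc, String expression) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            return (NodeList) xPath.evaluate(expression, doc, XPathConstants.NODESET);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("invalid XPath: " + expression, e);
        }
    }

    // Return the first match, or null if nothing matches.
    // Assumes the expression selects elements; otherwise the cast fails.
    static Element findFirst(Document doc, String expression) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            return (Element) xPath.evaluate(expression, doc, XPathConstants.NODE);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(PAGE);
        NodeList links = findAll(doc, "/html/body/p/div[3]/a");
        System.out.println(links.getLength());                      // prints 1
        Element first = findFirst(doc, "/html/body/p/div[3]/a");
        System.out.println(first.getTextContent());                 // prints link
    }
}
```

XPathConstants.NODESET suits the "elements" case and XPathConstants.NODE the single "element" case. If the same expression is evaluated many times, compiling it once with XPath.compile() and reusing the resulting XPathExpression may help, though I have not measured that here.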


Archived:	14 years, 8 months ago
Views:	50853
Last recorded:	7 years, 10 months ago