I have HTML like this:
<strong>Term:</strong>
Some text<br />
More text<br />
Some more lines of text
<strong>Term:</strong>
Some text<br />
More text<br />
Some more lines of text
<strong>Second term:</strong>
Some text<br />
More text<br />
Some more lines of text
<strong>Term:</strong>
Some text<br />
More text<br />
Some more lines of text
I need to get the text nodes that sit between a tag containing the text "Term" and the next tag:
Some text
More text
Some more lines of text
Some text
More text
Some more lines of text
Some text
More text
Some more lines of text
A condition could be used here: the preceding tag must contain the text "Term". But I don't know how to write such an XPath selector.
//text()[preceding::*[contains(text(),'Term:')] and following::*[contains(text(),'Term:')]]
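This is the same as what empo suggested. However, I look for any node that contains Term and return all the text nodes that lie between such nodes.
Note that this only works properly if you don't have any other "terms". If you do, let me know, because this XPath will then also return some unwanted values.
Since you have now updated the input, I just added one more condition to the previous XPath.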
//text()[preceding::*[contains(text(),'Term:')] and following::*[contains(text(),'Term:')] and not(contains(., 'Term:'))]
@empo's solution works as well, but it takes the <strong> element into account. The XPath I wrote only checks for the string 'Term:' and returns all the text nodes in between.
Let me know if this works for you.
Regards.
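To make it easier to see what the second predicate above selects, here is a minimal plain-Python sketch that emulates it on a flat document. It is not an XPath engine: it only handles elements that are direct children of a single wrapper, and it skips whitespace-only text nodes for readability. The `<root>` wrapper, the numbered sample text, and the helper names `has_term` and `select_between_terms` are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Sample modelled on the question's HTML. The <root> wrapper and the numbered
# text are assumptions, added so the four sections can be told apart.
DOC = """<root>
<strong>Term:</strong>
Some text<br />
More text<br />
Some more lines of text
<strong>Term:</strong>
Some text2<br />
More text2<br />
Some more lines of text2
<strong>Second term:</strong>
Some text3<br />
More text3<br />
Some more lines of text3
<strong>Term:</strong>
Some text4<br />
More text4<br />
Some more lines of text4
</root>"""


def has_term(elem):
    # Stands in for contains(text(), 'Term:') on the element's first text node.
    return "Term:" in (elem.text or "")


def select_between_terms(root):
    children = list(root)
    picked = []
    for i, child in enumerate(children):
        after = children[i + 1:]  # elements on the following:: axis
        # Two candidate text nodes: the one inside `child` (child itself is an
        # ancestor, so it is not "preceding") and the one right after `child`.
        candidates = [(child.text, children[:i]), (child.tail, children[:i + 1])]
        for text, before in candidates:
            if not text or not text.strip():
                continue  # absent or whitespace-only node, skipped for readability
            if "Term:" in text:
                continue  # stands in for not(contains(., 'Term:'))
            if any(has_term(e) for e in before) and any(has_term(e) for e in after):
                picked.append(text.strip())
    return picked


result = select_between_terms(ET.fromstring(DOC))
print("\n".join(result))
```

On this sample the sketch also picks up `Second term:` and the three `text3` lines, and it drops the last section because nothing containing `Term:` follows it. That appears consistent with the caveat above about unwanted values when other terms are present.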
Your question is still unclear, and your input document is not well-formed. Check this:
root/text()[preceding::strong[1][contains(text(),'Term')]]
Applied to:
<root>
<strong>Term:</strong>
Some text<br />
More text<br />
Some more lines of text
<strong>Term:</strong>
Some text2<br />
More text2<br />
Some more lines of text2
<strong>Second term:</strong>
Some text3<br />
More text3<br />
Some more lines of text3
<strong>Term:</strong>
Some text4<br />
More text4<br />
Some more lines of text4
</root>
Produces:
Some text
More text
Some more lines of text
Some text2
More text2
Some more lines of text2
Some text4
More text4
Some more lines of text4
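The logic behind `preceding::strong[1]` can be sketched in plain Python for readers who want to trace it step by step. This is only an illustration under assumptions, not a replacement for an XPath engine: it uses the standard library's `xml.etree.ElementTree`, which does not support the `preceding` axis, so the "nearest preceding `<strong>`" is tracked by hand while walking the children of `<root>`. The function name `text_after_term` and its `needle` parameter are hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """<root>
<strong>Term:</strong>
Some text<br />
More text<br />
Some more lines of text
<strong>Term:</strong>
Some text2<br />
More text2<br />
Some more lines of text2
<strong>Second term:</strong>
Some text3<br />
More text3<br />
Some more lines of text3
<strong>Term:</strong>
Some text4<br />
More text4<br />
Some more lines of text4
</root>"""


def text_after_term(root, needle="Term"):
    picked = []
    keep = False  # True while the nearest preceding <strong> contains `needle`
    for child in root:
        if child.tag == "strong":
            # The substring test is case-sensitive, like XPath contains(),
            # so "Second term:" does not match "Term".
            keep = needle in (child.text or "")
        # ElementTree stores the text node that follows an element as its tail.
        tail = (child.tail or "").strip()
        if keep and tail:
            picked.append(tail)
    return picked


lines = text_after_term(ET.fromstring(DOC))
print("\n".join(lines))
```

The section under `Second term:` is skipped only because the match is case-sensitive. A heading such as `Another Term:` would be kept, so a stricter comparison may be needed for real data.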
This XPath selects all text nodes that lie between an element containing the string Term: and a later element that contains any text:
//text()[preceding::*[contains(text(),'Term:')] and following::*[text()]]
Applied to:
<root>
<strong>Term:</strong>
Some text<br />
More text<br />
Some more lines of text
<strong>Second term:</strong>
Some text2<br />
More text2<br />
Some more lines of text2
</root>
Returns:
Some text
More text
Some more lines of text
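The two conditions in that predicate can be traced with a small plain-Python sketch. It is a hand-written emulation for flat documents only, under the assumption that every element is a direct child of `<root>`, and it skips whitespace-only text nodes. The function name `between_term_and_next_text_element` is hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """<root>
<strong>Term:</strong>
Some text<br />
More text<br />
Some more lines of text
<strong>Second term:</strong>
Some text2<br />
More text2<br />
Some more lines of text2
</root>"""


def between_term_and_next_text_element(root):
    children = list(root)
    picked = []
    for i, child in enumerate(children):
        after = children[i + 1:]
        # following::*[text()]: some later element must have text of its own.
        # <br /> elements have none, so they never satisfy this condition.
        if not any(e.text for e in after):
            continue
        # Candidate text nodes: the one inside `child` and the one right after it,
        # each paired with the elements that precede it.
        for text, before in ((child.text, children[:i]), (child.tail, children[:i + 1])):
            if text and text.strip() and any("Term:" in (e.text or "") for e in before):
                picked.append(text.strip())
    return picked


selected = between_term_and_next_text_element(ET.fromstring(DOC))
print("\n".join(selected))
```

The lines after `Second term:` are left out here only because no element with text follows them. If another `<strong>` were appended to the document, they would be selected as well.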