DOM XPath: find #text nodes and wrap them in paragraph tags

Xeo*_*oss 10 html php xpath dom

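I want to find all root-level #text nodes (or those whose parent is a div) that should be wrapped in a <p> tag. In the text below, there should end up being three (or even just two) root-level <p> tags.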

<div>
    This text should be wrapped in a p tag.
</div>

This also should be wrapped.

<b>And</b> this.

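The idea is to format the text better by grouping blocks of text into paragraphs for HTML display. However, the following XPath that I have been working on does not seem to select the text nodes.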

<?php

$html = '<div>
    This text should be wrapped in a p tag.
</div>

This also should be wrapped.

<b>And</b> this.';

libxml_use_internal_errors(TRUE);

// loadHTML() is an instance method; calling it statically fails on PHP 8.
$dom = new DOMDocument();
$dom->loadHTML($html);

$xp = new DOMXPath($dom);

$xpath = '//text()[not(parent::p) and normalize-space()]';

foreach($xp->query($xpath) as $node) {
    $element = $dom->createElement('p');
    $node->parentNode->replaceChild($element, $node);
    $element->appendChild($node);
}

print $dom->saveHTML();

nwe*_*hof 7

OK, let me rephrase my comment as an answer. If you want to match all text nodes, simply remove the //div part from your XPath expression, so that it becomes:

//text()[not(parent::p) and normalize-space()]
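Note that the expression above also matches text inside inline elements such as <b>, so "And" would get its own <p> inside the <b>. If only root-level text and text directly inside a div should be wrapped, as the question describes, one possible variant is sketched below. The parent::body / parent::div test is my assumption about what "root level" means once loadHTML() has added the implied <html> and <body> elements; it is not part of the original answer.

```php
<?php
// Sketch only: restricting matches to parent::body or parent::div is an
// assumption about what "root level" means after loadHTML() has wrapped
// the fragment in <html><body>.

$html = '<div>
    This text should be wrapped in a p tag.
</div>

This also should be wrapped.

<b>And</b> this.';

libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

$xp = new DOMXPath($dom);

// Non-blank text nodes that sit directly under <body> or under a <div>.
$xpath = '//text()[(parent::body or parent::div) and normalize-space()]';

foreach ($xp->query($xpath) as $node) {
    $p = $dom->createElement('p');
    // Put the new <p> where the text node was, then move the text into it.
    $node->parentNode->replaceChild($p, $node);
    $p->appendChild($node);
}

echo $dom->saveHTML();
```

With this variant the <b>And</b> element itself stays outside any <p>, because its text node has <b> as its parent. Grouping an inline element together with the text that follows it into a single paragraph would need extra logic that is not shown here.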