PHP: Remove a specific tag from an HTML string?

cod*_*ama 20 html php tags replace domparser

I have the following HTML:

<html>
 <body>
 bla bla bla bla
  <div id="myDiv"> 
         more text
      <div id="anotherDiv">
           And even more text
      </div>
  </div>

  bla bla bla
 </body>
</html>

I want to remove everything from <div id="anotherDiv"> up to and including its closing </div>. How do I do that?

Gor*_*don 33

Use the native DOM extension:

$dom = new DOMDocument;
$dom->loadHTML($htmlString);
$xPath = new DOMXPath($dom);
$nodes = $xPath->query('//*[@id="anotherDiv"]');
if($nodes->item(0)) {
    $nodes->item(0)->parentNode->removeChild($nodes->item(0));
}
echo $dom->saveHTML();
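The same approach can be wrapped in a reusable function. The following is only a sketch: the name removeElementById() is hypothetical and does not come from the answer, and the libxml error suppression is an added assumption meant to keep warnings about imperfect markup out of the output.

```php
<?php
// Hypothetical helper (not part of the original answer):
// removes the first element whose id attribute equals $id.
function removeElementById(string $html, string $id): string
{
    $dom = new DOMDocument;

    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xPath = new DOMXPath($dom);
    // Assumption: $id contains no double quotes, because it is
    // embedded directly in the XPath expression.
    $node = $xPath->query('//*[@id="' . $id . '"]')->item(0);

    if ($node !== null && $node->parentNode !== null) {
        $node->parentNode->removeChild($node);
    }

    return (string) $dom->saveHTML();
}

$htmlString = '<html><body>bla <div id="myDiv">more text '
    . '<div id="anotherDiv">And even more text</div></div> bla</body></html>';

$result = removeElementById($htmlString, 'anotherDiv');
echo $result;
```

The outer div and its own text survive; only the element with the matching id and everything inside it are dropped.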


Hai*_*vgi 14

You can use preg_replace():

// Removes only the opening <div id="someid" ...> tag, not its content or closing tag.
$string = preg_replace('/<div id="someid"[^>]*>/i', "", $string);
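To remove the whole element from the question with a regular expression, the pattern has to extend to the closing tag. The sketch below is not the answerer's code. It assumes the attribute is written exactly as id="anotherDiv" and that the target div contains no nested div, which holds for the HTML in the question but not in general; a DOM parser is the safer choice otherwise.

```php
<?php
$string = '<div id="myDiv">more text <div id="anotherDiv">And even more text</div></div>';

// Assumption: no <div> is nested inside the target. The lazy .*? stops at
// the first </div>, so a nested div would leave broken markup behind.
// The s modifier lets . match newlines; i makes the match case-insensitive.
$pattern = '/<div\s+id="anotherDiv"[^>]*>.*?<\/div>/is';

$cleaned = preg_replace($pattern, '', $string);
echo $cleaned; // <div id="myDiv">more text </div>
```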

  • He updated the question, so I have now updated my answer. (2 upvotes)

Raf*_*shi 5

In addition to Haim Evgi's answer, here is a version using libxml:

Input HTML

$html='<html>
 <body>
 bla bla bla bla
  <div id="myDiv"> 
         more text
      <div id="anotherDiv">
           And even more text
      </div>
  </div>

  bla bla bla
 </body>
</html>';

Edit

Handling the doctype:

$dom = new DOMDocument;
$dom->validateOnParse = false;
$dom->loadHTML($html);

// get the tag
$div = $dom->getElementById('anotherDiv');

// delete the tag
if ($div && $div->nodeType == XML_ELEMENT_NODE) {
    $div->parentNode->removeChild($div);
}

echo $dom->saveHTML();
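The answer does not spell out which doctype behavior it has in mind. One related point: when the input has no doctype, saveHTML() normally prepends a default one. The sketch below is an assumption about what is wanted rather than the answerer's method. It passes the LIBXML_HTML_NODEFDTD option to loadHTML() so that no default doctype is added to the output.

```php
<?php
$html = '<html><body>bla <div id="myDiv">more text '
    . '<div id="anotherDiv">And even more text</div></div> bla</body></html>';

$dom = new DOMDocument;

// LIBXML_HTML_NODEFDTD: do not add a default doctype when none is present.
$dom->loadHTML($html, LIBXML_HTML_NODEFDTD);

// Look up the element and detach it from its parent.
$div = $dom->getElementById('anotherDiv');
if ($div !== null && $div->parentNode !== null) {
    $div->parentNode->removeChild($div);
}

$output = $dom->saveHTML();
echo $output;
```

If the input already carries its own doctype, that declaration is part of the parsed document and is written back out unchanged.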

Resources

https://gist.github.com/rafasashi/59c9448f5467ea427fa3