How can I find all HTML hyperlink tags in a string and replace them with their href values?

hsa*_*ite 2 html php hyperlink

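I want to take a string of text, find every hyperlink tag in it, read its href value, and replace the entire hyperlink tag with the value of its href attribute.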

Vol*_*erK 5

There are many possibilities. For example, you can use the DOM extension: DOMDocument::loadHTML() together with XPath (although getElementsByTagName() would be sufficient in this case).

<?php
$string = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd"><html><head><title>...</title></head><body>
  <p>
    mary had a <a href="little">greedy</a> lamb
    whose fleece was <a href="white">cold</a> as snow
  </p>
</body></html>';

// Parse the markup into a DOM tree.
$doc = new DOMDocument;
$doc->loadHTML($string);

// Select every <a> element and swap it for a text node holding its href.
$xpath = new DOMXPath($doc);
foreach( $xpath->query('//a') as $a ) {
  $tn = $doc->createTextNode($a->getAttribute('href'));
  $a->parentNode->replaceChild($tn, $a);
}

echo $doc->saveHTML();

prints

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head><title>...</title></head>
<body><p>
    mary had a little lamb
    whose fleece was white as snow
  </p></body>
</html>
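The getElementsByTagName() variant mentioned above could look roughly like the sketch below. The function name replaceLinksWithHref is made up for illustration and is not part of the DOM extension. One thing to watch for: the list returned by getElementsByTagName() is live, so it is copied into an array before any nodes are replaced. The libxml_use_internal_errors() calls are an optional addition to keep parser warnings for imperfect markup quiet.

```php
<?php
// Hypothetical helper (name chosen for this example only).
function replaceLinksWithHref(string $html): string
{
    $doc = new DOMDocument;

    // Collect libxml warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The node list is live: copy it first so that replacing nodes
    // does not shift the list while we are iterating over it.
    $anchors = iterator_to_array($doc->getElementsByTagName('a'));

    foreach ($anchors as $a) {
        $text = $doc->createTextNode($a->getAttribute('href'));
        $a->parentNode->replaceChild($text, $a);
    }

    return $doc->saveHTML();
}

$result = replaceLinksWithHref(
    '<html><body><p>mary had a <a href="little">greedy</a> lamb ' .
    'whose fleece was <a href="white">cold</a> as snow</p></body></html>'
);
echo $result;
```

Because loadHTML() parses the input as a full document, the returned string includes a doctype and the html/body wrappers even if the input was only a fragment.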

  • @hsatterwhite: See http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html. Regular expressions may be _part_ of a parser, but regular expressions alone cannot do the job (for arbitrary HTML). (2 upvotes)