chr*_*ris 13 php dom html-parsing
I'm new to DOM parsing in PHP.
I have an HTML file that I'm trying to parse. It contains a bunch of DIVs like this:
<div id="interestingbox">
<div id="interestingdetails" class="txtnormal">
<div>Content1</div>
<div>Content2</div>
</div>
</div>
<div id="interestingbox">
......
I'm trying to get the contents of these div boxes using PHP. How can I do this with a DOM parser?
Thanks!
ape*_*ari 20
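First, I have to point out that you can't use the same id on two different divs; that is what classes are for. Every element should have a unique id.
Code to get the contents of the div with id="interestingbox":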
$html = '
<html>
<head></head>
<body>
<div id="interestingbox">
<div id="interestingdetails" class="txtnormal">
<div>Content1</div>
<div>Content2</div>
</div>
</div>
<div id="interestingbox2"><a href="#">a link</a></div>
</body>
</html>';
$dom_document = new DOMDocument();
$dom_document->loadHTML($html);
// use DOMXPath to navigate the HTML through the DOM
$dom_xpath = new DOMXPath($dom_document);
// select every div with id="interestingbox", at any depth in the document
$elements = $dom_xpath->query("//div[@id='interestingbox']");
// query() returns false (not null) when the expression is invalid
if ($elements !== false) {
    foreach ($elements as $element) {
        echo "\n[" . $element->nodeName . "]";
        $nodes = $element->childNodes;
        foreach ($nodes as $node) {
            echo $node->nodeValue . "\n";
        }
    }
}
//OUTPUT (blank lines omitted)
[div]
Content1
Content2
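The loop above echoes every child node, including whitespace-only text nodes, so the raw output contains extra blank lines. As a minimal sketch of one way to collect just the inner values, the following selects only the leaf div elements inside the box and trims their text. The function name boxContents is hypothetical (it is not part of the answer or of PHP), and the id is interpolated into the XPath expression unescaped, so this assumes a trusted id without quote characters.

```php
<?php
// Hypothetical helper (an illustration, not part of the original answer):
// returns the trimmed text of each leaf <div> inside any <div> with id $id.
// Assumption: $id is trusted and contains no quote characters.
function boxContents(string $html, string $id): array
{
    // Collect libxml warnings (e.g. duplicate ids) instead of printing them.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $result = [];
    // div[not(*)] keeps only <div> elements that have no element children.
    foreach ($xpath->query("//div[@id='$id']//div[not(*)]") as $div) {
        $result[] = trim($div->textContent);
    }
    return $result;
}

$html = '<html><body>
<div id="interestingbox">
  <div id="interestingdetails" class="txtnormal">
    <div>Content1</div>
    <div>Content2</div>
  </div>
</div>
</body></html>';

$contents = boxContents($html, 'interestingbox');
print_r($contents);
```

For the sample markup this yields an array holding Content1 and Content2. Because the expression starts with //, it also picks up every box when the same id is (incorrectly) repeated in the page.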
Example using a class:
$html = '
<html>
<head></head>
<body>
<div class="interestingbox">
<div id="interestingdetails" class="txtnormal">
<div>Content1</div>
<div>Content2</div>
</div>
</div>
<div class="interestingbox"><a href="#">a link</a></div>
</body>
</html>';
// same as before; only the XPath expression changes
[...]
$elements = $dom_xpath->query("//div[@class='interestingbox']");
[...]
//OUTPUT (blank lines omitted)
[div]
Content1
Content2
[div]
a link
See the DOMXPath page for more details.
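One caveat with the class example: the predicate @class='interestingbox' compares the whole attribute value, so a div with class="interestingbox highlighted" would not match. A common XPath 1.0 idiom pads the attribute with spaces and looks for the padded token. The sketch below shows it; the helper name classQuery and the extra sample classes are assumptions made for illustration only.

```php
<?php
// Hypothetical helper: builds an XPath expression that matches <div> elements
// whose class attribute contains $class as a whole, space-separated token.
// Assumption: $class is trusted and contains no quote characters.
function classQuery(string $class): string
{
    return "//div[contains(concat(' ', normalize-space(@class), ' '), ' $class ')]";
}

$html = '<html><body>
<div class="interestingbox highlighted"><div>Content1</div></div>
<div class="interestingbox"><a href="#">a link</a></div>
<div class="notinterestingbox">skip me</div>
</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$texts = [];
foreach ($xpath->query(classQuery('interestingbox')) as $div) {
    // textContent concatenates the text of all descendants of the box.
    $texts[] = trim($div->textContent);
}
print_r($texts);
```

Here the first two boxes match and the third does not, because the padded token has to be surrounded by spaces on both sides.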
I used simplehtmldom to get started with this:
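// assumes the library file simple_html_dom.php sits next to this script
require_once 'simple_html_dom.php';
// file_get_html() needs a full URL (with scheme) or a local file path
$html = file_get_html('http://example.com/');
foreach ($html->find('div[id=interestingbox]') as $result) {
    echo $result->innertext;
}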