Jos*_*Cox 4 html php xpath dom
I want to extract the string "hinson lou ann" from:
<div class='owner-name'>hinson lou ann</div>
When I run the following:
$html = "http://gisapps.co.union.nc.us/ws/rest/v2/cm_iw.ashx?gid=12339";
$doc = new DOMDocument();
$doc->loadHTMLFile($html);
$xpath = new DOMXpath($doc);
$elements = $xpath->query("*/div[@class='owner-name']");
if (!is_null($elements)) {
    foreach ($elements as $element) {
        echo "<br/>[" . $element->nodeName . "]";
        $nodes = $element->childNodes;
        foreach ($nodes as $node) {
            echo $node->nodeValue . "\n";
        }
    }
}
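I get an error:
Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: htmlParseEntityRef: no name in http://gisapps.co.union.nc.us/ws/rest/v2/cm_iw.ashx?gid=12339, line: 1 in /home ... on line ...
This refers to the loadHTMLFile line.
Note: the file is not valid HTML; it contains only div tags! Should I load the file and then wrap its contents in html and body tags, or what?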
If you really must try to parse it, try this:
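<?php
// Fetch the raw response; it is a bare fragment of <div> tags, not a full page.
$html = file_get_contents("http://gisapps.co.union.nc.us/ws/rest/v2/cm_iw.ashx?gid=12339");
$doc = new DOMDocument();
$doc->strictErrorChecking = false;
$doc->recover = true;
// Wrap the fragment in html/body tags; @ suppresses the parser warnings.
@$doc->loadHTML("<html><body>" . $html . "</body></html>");
$xpath = new DOMXPath($doc);
// The leading // searches the whole document instead of one level below the root.
$elements = $xpath->query("//*/div[@class='owner-name']");
// query() returns false (not null) when the expression is invalid.
if ($elements !== false) {
    foreach ($elements as $element) {
        echo "<br/>[" . $element->nodeName . "]";
        $nodes = $element->childNodes;
        foreach ($nodes as $node) {
            echo $node->nodeValue . "\n";
        }
    }
}
?>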
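The @ operator in the snippet above hides every warning from that call, including ones you might want to see. A possible alternative, sketched here under assumptions, is to let libxml collect the parser messages so you can inspect or discard them yourself. The $fragment string is a made-up stand-in for the fetched response (it is not the real data), and whether the bare "&" produces the same "htmlParseEntityRef: no name" message may depend on the libxml version in use.

```php
<?php
// Hypothetical stand-in for the fetched response: a bare fragment containing
// an unescaped "&", the kind of input that tends to upset the HTML parser.
$fragment = "<div class='owner-name'>hinson lou ann</div><div class='note'>a & b</div>";

// Ask libxml to store its warnings internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML("<html><body>" . $fragment . "</body></html>");

// Copy out whatever the parser complained about, then reset libxml's state.
$warnings = [];
foreach (libxml_get_errors() as $error) {
    $warnings[] = trim($error->message);
}
libxml_clear_errors();
libxml_use_internal_errors($previous);

foreach ($warnings as $warning) {
    echo "parser warning: " . $warning . "\n";
}
```

This keeps the page output clean while still letting you log the messages if the remote markup changes.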
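PS: Your XPath was wrong, so I fixed it. Also, that DIV element (.owner-name) has no child elements, only its text, so the inner loop over $nodes is not really needed; you can read $element->nodeValue directly and revise that part accordingly.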
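For what it's worth, here is a minimal sketch of reading the text straight from the matched element instead of looping over childNodes. It uses an inline sample string ($fragment, a stand-in I made up from the markup in the question) rather than the live URL, and the simpler expression //div[@class='owner-name'], which I am assuming is enough for this markup.

```php
<?php
// Stand-in for the fetched fragment, copied from the markup in the question.
$fragment = "<div class='owner-name'>hinson lou ann</div>";

$doc = new DOMDocument();
// Wrap the fragment so the parser sees a complete document.
$doc->loadHTML("<html><body>" . $fragment . "</body></html>");

$xpath = new DOMXPath($doc);

// Collect the text of every matching div; textContent returns the
// concatenated text of the element and its descendants.
$names = [];
foreach ($xpath->query("//div[@class='owner-name']") as $element) {
    $names[] = trim($element->textContent);
}

echo implode("\n", $names) . "\n";
```

If the class attribute ever carries several class names, the exact-match predicate would need adjusting; for the single class shown in the question it should be fine.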