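I am trying to parse an HTML file.
The idea is to fetch the spans with the classes title and desc, and to read their contents within each div that has the attribute class='thebest'.
Here is my code: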
<?php
$example=<<<KFIR
<html>
<head>
<title>test</title>
</head>
<body>
<div class="a">moshe1
<div class="aa">haim</div>
</div>
<div class="a">moshe2</div>
<div class="b">moshe3</div>
<div class="thebest">
<span class="title">title1</span>
<span class="desc">desc1</span>
</div>
<div class="thebest">
<span class="title">title2</span>
<span class="desc">desc2</span>
</div>
</body>
</html>
KFIR;
$doc = new DOMDocument();
@$doc->loadHTML($example);
$xpath = new DOMXPath($doc);
$expression="//div[@class='thebest']";
$arts = $xpath->query($expression);
foreach ($arts as $art) {
    $arts2=$xpath->query("//span[@class='title']",$art);
    echo $arts2->item(0)->nodeValue;
    $arts2=$xpath->query("//span[@class='desc']",$art);
    echo $arts2->item(0)->nodeValue;
}
echo "done";
The expected result is:
title1desc1title2desc2done
The result I get is:
title1desc1title1desc1done
sal*_*the 11
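Make the queries relative: start them with a dot (e.g. ".//…"). An expression that begins with "//" is evaluated from the document root even when a context node is passed, so every iteration matched the spans of the first div.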
foreach ($arts as $art) {
    // Note: "./span" (single slash) matches only direct children of $art;
    // use ".//span" to match descendants at any depth.
    $titles = $xpath->query("./span[@class='title']", $art);
    if ($titles->length > 0) {
        $title = $titles->item(0)->nodeValue;
        echo $title;
    }
    $descs = $xpath->query("./span[@class='desc']", $art);
    if ($descs->length > 0) {
        $desc = $descs->item(0)->nodeValue;
        echo $desc;
    }
}
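To see the difference side by side, here is a minimal sketch that runs the same loop once with an absolute query and once with a relative one. The markup is a reduced copy of the question's document, and the helper name `firstMatchPerDiv` is hypothetical, introduced only for this illustration.

```php
<?php
// Reduced sample markup (assumption: trimmed from the question's document).
$markup = <<<MARKUP
<html><body>
<div class="thebest">
<span class="title">title1</span>
<span class="desc">desc1</span>
</div>
<div class="thebest">
<span class="title">title2</span>
<span class="desc">desc2</span>
</div>
</body></html>
MARKUP;

/**
 * Hypothetical helper: for every div with class "thebest", take the first
 * node matched by $query (evaluated with that div as the context node)
 * and concatenate the node values.
 */
function firstMatchPerDiv(string $html, string $query): string
{
    // Collect libxml warnings internally instead of silencing them with @.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $out = '';
    foreach ($xpath->query("//div[@class='thebest']") as $div) {
        $nodes = $xpath->query($query, $div);
        if ($nodes->length > 0) {
            $out .= $nodes->item(0)->nodeValue;
        }
    }
    return $out;
}

// Absolute: "//" starts at the document root, so the context node is ignored.
echo firstMatchPerDiv($markup, "//span[@class='title']"), "\n";  // title1title1
// Relative: ".//" starts at the context node passed as the second argument.
echo firstMatchPerDiv($markup, ".//span[@class='title']"), "\n"; // title1title2
```

The only thing that changes between the two calls is the leading dot, which is what ties the expression to the div passed as the second argument of `query()`.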