PHP Simple HTML DOM Parser: find a string

Question

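I am using PHP Simple HTML DOM Parser, but it does not seem to have a way to search for text. I need to search for a string and find the ID of its parent element, which is essentially the reverse of how the parser is normally used.

Does anyone know how to do this?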

Answer 1

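// Assumes simple_html_dom.php has already been included.
$html = file_get_html('http://www.google.com/');

// Visit every element and print the id of each one whose inner HTML contains the string.
// Note: every ancestor of a matching element also matches, and elements without an id print nothing useful.
foreach ($html->find('*') as $e) {
    if (strpos($e->innertext, 'theString') !== false) {
        echo $e->id, "\n";
    }
}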

http://simplehtmldom.sourceforge.net/manual.htm

Did you ever get a working solution out of this? (2 upvotes)

Answer 2

wak*_*spb 5

Imagine that every tag has a "plaintext" attribute, and use the standard attribute selector on it.

So this HTML:

<div id="div1">
  <span>London is the capital</span> of Great Britain
</div>
<div id="div2">
  <span>Washington is the capital</span> of the USA
</div>

can be thought of as:

<div id="div1" plaintext="London is the capital  of Great Britain">
  <span plaintext="London is the capital ">London is the capital</span> of Great Britain
</div>
<div id="div2" plaintext="Washington is the capital  of the USA">
  <span plaintext="Washington is the capital ">Washington is the capital</span> of the USA
</div>

And the PHP that solves your task is simply:

<?php
  $t = '
    <div id="div1">
      <span>London is the capital</span> of Great Britain
    </div>
    <div id="div2">
      <span>Washington is the capital</span> of the USA
    </div>';
  $html = str_get_html($t);
  $foo = $html->find('span[plaintext^=London]');
  echo "ID: " . $foo[0]->parent()->id; // div1
?>

(Keep in mind that the "plaintext" of a <span> tag is right-padded with a space character; this is Simple HTML DOM's default behavior, defined by the constant DEFAULT_SPAN_TEXT.)

Answer 3

Wri*_*ken 3

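// Note: loadXML() needs well-formed XML in $xml; for ordinary HTML, loadHTML() is the usual choice.
$d = new DOMDocument();
$d->loadXML($xml);
$x = new DOMXPath($d);
// Collect the id attributes of every ancestor of any text node containing '617.99'.
$result = $x->evaluate("//text()[contains(.,'617.99')]/ancestor::*/@id");
$unique = null;
// Walk the result backwards (later entries sit deeper in the document) and stop at the first id that occurs exactly once.
for ($i = $result->length - 1; $i >= 0 && $item = $result->item($i); $i--) {
    // addslashes() is kept from the original; XPath 1.0 has no backslash escaping, so this is only safe for ids without quotes.
    if ($x->query("//*[@id='" . addslashes($item->value) . "']")->length == 1) {
        echo 'Unique ID is ' . $item->value . "\n";
        $unique = $item->value;
        break;
    }
}
if (is_null($unique)) echo 'no unique ID found';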
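
For readers who want to try the DOMDocument approach on plain HTML, here is a minimal, self-contained sketch. It is not the answerer's code: it assumes the input is HTML (so it uses loadHTML() rather than loadXML()), it returns the id of the nearest ancestor that has one, and the helper name findParentIds() and the $sample markup (borrowed from Answer 2) are purely illustrative.

```php
<?php
// Sketch: for every text node containing $needle, return the id of the
// closest ancestor element that carries an id attribute.
// findParentIds() is a hypothetical helper name, not part of any library.
function findParentIds(string $html, string $needle): array
{
    $doc = new DOMDocument();
    // loadHTML() tolerates real-world HTML; @ silences warnings on sloppy markup.
    @$doc->loadHTML($html);
    $xpath = new DOMXPath($doc);

    $ids = [];
    // XPath 1.0 string literals cannot be escaped, so the text is filtered
    // in PHP instead of embedding $needle in the query.
    foreach ($xpath->query('//text()') as $text) {
        if (strpos($text->nodeValue, $needle) === false) {
            continue;
        }
        // On the reverse ancestor axis, [1] selects the nearest match.
        $owner = $xpath->query('ancestor::*[@id][1]', $text)->item(0);
        if ($owner !== null) {
            $ids[] = $owner->getAttribute('id');
        }
    }
    return array_values(array_unique($ids));
}

$sample = '
<div id="div1"><span>London is the capital</span> of Great Britain</div>
<div id="div2"><span>Washington is the capital</span> of the USA</div>';

print_r(findParentIds($sample, 'London'));
```

With the sample markup, searching for 'London' should return only div1, while 'capital' should return both div1 and div2. Filtering in PHP rather than in the XPath expression is a deliberate choice here, since it avoids having to quote arbitrary search strings inside the query.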

This is PHP's `DOMDocument`, not the [SimpleHTMLDom Library](http://simplehtmldom.sourceforge.net/) that the OP said he is using. (2 upvotes)
Hm, I missed that. I still can't understand why people use that slow, slow thingamajig, but you're right, this is not the answer the OP was looking for. (2 upvotes)

Archived:	14 years, 9 months ago
Views:	25377
Last recorded:	10 years, 6 months ago