I'm using Simple HTML DOM Parser to parse some HTML.
I have HTML like this:
<span class="UIStory_Message">
Yeah, elixir of life!<br/>
<a href="asdfasdf">
<span>asdfsdfasdfsdf</span>
<wbr/>
<span class="word_break"/>
61193133389&ref=nf
</a>
</span>
My code is:
$storyMessageNodes = $story->find('span.UIStory_Message');
$storyMessage = strip_tags($storyMessageNodes[0]->innertext); // find() returns an array of nodes
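I want to get only the text that belongs directly to the span "UIStory_Message", i.e. "Yeah, elixir of life!".
But the code above gives me all the text inside the span, including its child elements, i.e. "Yeah, elixir of life! asdfsdfasdfsdf 61193133389&ref=nf".
How can I write this so that it returns only "Yeah, elixir of life!"?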
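I have written a method that strips unwanted elements from a fetched DOM node. I contacted the author, but Simple HTML DOM appears to have been inactive for two years, so I doubt he will include it in the distribution. Here it is: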
/**
* remove specified nodes from selected dom
*
* @param string $selector
* @param int|array $limit (optional) possible values include:
* + positive integer - remove first denoted number of elements
* + negative integer - remove last denoted number of elements
* + array of ones and zeroes - remove the respective matches that equal to one
*
* eg.
* // will remove first two images found in node
* $dom->removeNodes('img',2);
*
* // will remove last two images found in node
* $dom->removeNodes('img',-2);
*
* // will remove all but the third images found in node
* $dom->removeNodes('img',array(1,1,0,1));
*
* [!!!] if there are more matches found than elements in array, the last array member will be used for processing
*
* eg.
* // will remove second and every following image
* $dom->removeNodes('img',array(0,1));
*
* // will remove only the second image
* $dom->removeNodes('img',array(0,1,0));
*
* @return simple_html_dom_node
*/
public function removeNodes($selector, $limit = NULL)
{
$elements = $this->find($selector);
if ( empty($elements) ) return $this;
if ( isset($limit) && is_int( $limit ) && $limit < 0 ) {
$limit = abs( $limit );
$elements = array_reverse( $elements );
}
foreach ( $elements as $element ) {
if ( isset($limit) ) {
if ( is_array( $limit ) ) {
$current = current( $limit );
if ( next( $limit ) === FALSE ) {
end( $limit );
}
if ( !$current ) {
continue;
}
} else {
if ( --$limit === -1 ) {
return $this;
}
}
}
$element->outertext = '';
}
return $this;
}
Put it in the simple_html_dom_node class, or extend that class. In the asker's case you would use it like this:
$storyMessageNodes = $story->find('span.UIStory_Message');
$storyMessage = $storyMessageNodes[0]->removeNodes('a')->plaintext;
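For comparison, here is a minimal sketch of a different route that does not patch Simple HTML DOM at all: it uses PHP's built-in DOMDocument and DOMXPath and keeps only the text nodes that are direct children of the span. The helper name directText is hypothetical, the sample markup is a condensed version of the asker's HTML, and the class name is assumed to be a plain identifier without quotes.

```php
<?php
// Hypothetical helper (not part of Simple HTML DOM): returns the text that is
// a direct child of the first <span> carrying the given class, ignoring any
// text nested inside child elements such as <a> or inner <span>.
function directText(string $html, string $class): string
{
    $doc = new DOMDocument();

    // Collect parser warnings (e.g. for <wbr/>) instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Match the class as a whole word within the class attribute.
    // Assumption: $class contains no quotes or XPath metacharacters.
    $xpath = new DOMXPath($doc);
    $query = sprintf(
        '//span[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );
    $span = $xpath->query($query)->item(0);
    if ($span === null) {
        return '';
    }

    // Concatenate only the text nodes directly under the span.
    $text = '';
    foreach ($span->childNodes as $child) {
        if ($child->nodeType === XML_TEXT_NODE) {
            $text .= $child->nodeValue;
        }
    }
    return trim($text);
}

$html = '<span class="UIStory_Message">Yeah, elixir of life!<br/>'
      . '<a href="asdfasdf"><span>asdfsdfasdfsdf</span><wbr/>'
      . '<span class="word_break"></span>61193133389&amp;ref=nf</a></span>';

echo directText($html, 'UIStory_Message'); // prints: Yeah, elixir of life!
```

Unlike removeNodes('a'), this does not need to know which child elements to drop, but it also discards text inside harmless inline children such as <b> or <em>, so which approach fits depends on the markup.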