I am trying to use phpQuery to scrape a web page (free-lance.ru).
The equivalent code in Simple HTML DOM works:
include('simple_html_dom.php');
$shd = str_get_html($html);
$projects = array();
$i = 0;
foreach ($shd->find('.project-preview') as $work){
    $projects[$i]['name'] = $work->find('h3', 0)->children(1)->plaintext;
    $i++;
}
But I need it in phpQuery.
I tried something like this:
include('phpQuery.php');
$pq = phpQuery::newDocument($html);
foreach ($pq->find('.project-preview') as $work){
    echo 'wow';
}
But it doesn't work: sizeof($pq->find('.project-preview')) is 0.
I would be very grateful for any help.
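The cause of the empty result is not established in this thread. One way to narrow it down is to check, independently of phpQuery, whether the fetched markup contains the class at all. The sketch below uses PHP's built-in DOM extension for that cross-check; `projectNames` is a hypothetical helper name, and the XPath expression is only an approximation of the `.project-preview` and `find('h3', 0)->children(1)` selectors used above.

```php
<?php
// Cross-check with the built-in DOM extension (not phpQuery).
// Assumption: $html holds the page markup as a string; character
// encoding is not handled here.
function projectNames(string $html): array
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings from real-world HTML
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match any element whose class list contains "project-preview".
    $query = "//*[contains(concat(' ', normalize-space(@class), ' '), ' project-preview ')]";

    $names = [];
    foreach ($xpath->query($query) as $node) {
        // Second child element of the first <h3> inside the match.
        $second = $xpath->query('(.//h3)[1]/*[2]', $node)->item(0);
        $names[] = $second === null ? null : trim($second->textContent);
    }
    return $names;
}
```

If this also returns an empty array, the class is probably missing from the downloaded HTML (for example because the content is loaded by JavaScript), which would point away from phpQuery as the culprit.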
How do I get an img src from an HTML page using phpQuery? Added: I need this "src" for use in a parser module for Drupal.
Edit 2
Thanks, everyone, for the help! By combining the answers with a few other forum posts, I managed to solve it like this:
$string = strip_tags($oNode['div.item-prijs']);
$array = str_split($string,1);
$arraytotal = ( $array[0] . ',' . $array[1] . $array[2] );
echo $arraytotal;
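Now the correct price is displayed: the PHP script automatically converts it to "7,49".
Sorry that I can't mark more than one answer as accepted. Case closed.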
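The three-character split above only fits prices whose whole part is a single digit. As a sketch, assuming each price is marked up as `<div class="item-prijs"><p>7<sup>49</sup></p></div>` (the structure visible in the output quoted in this thread), the whole part and the superscripted cents can be read separately. This uses PHP's built-in DOM extension rather than phpQuery, and `extractPrices` is a hypothetical helper name.

```php
<?php
// Read "whole" and "cents" from <p>7<sup>49</sup></p> inside div.item-prijs.
// Assumption: the class attribute is exactly "item-prijs".
function extractPrices(string $html): array
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence parser warnings
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $prices = [];
    foreach ($xpath->query('//div[@class="item-prijs"]/p') as $p) {
        // First text node is the whole part, first <sup> holds the cents.
        $whole = trim($xpath->evaluate('string(./text()[1])', $p));
        $cents = trim($xpath->evaluate('string(./sup[1])', $p));
        if ($whole === '' || $cents === '') {
            continue; // skip empty <p></p> entries
        }
        $prices[] = $whole . ',' . $cents;
    }
    return $prices;
}
```

Because the two parts are read from separate nodes, the result does not depend on how many digits the whole part has.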
Please, guys:
$price = strip_tags($oNode['div.item-prijs']);
$new_price = substr(chunk_split($price, 1, ','), 0, -1);
echo $new_price;
This echoes 7,4,9 instead of 7,49, but it is the best code so far. Does anyone know how to fix this?
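`chunk_split($price, 1, ',')` puts a comma after every single character, which is why every digit ends up separated. A minimal sketch of an alternative, assuming the last two digits are always the cents: strip the tags, keep only the digits, and insert one comma before the final two characters with `substr_replace`. `formatPrice` is a hypothetical helper name.

```php
<?php
// Insert a single decimal comma before the last two digits.
// Assumption: the final two digits of the stripped text are the cents.
function formatPrice(string $rawHtml): string
{
    // Remove markup, then drop anything that is not a digit.
    $digits = preg_replace('/\D/', '', strip_tags($rawHtml));
    if (strlen($digits) < 3) {
        return $digits; // too short to split into whole part and cents
    }
    // A length of 0 makes substr_replace insert instead of overwrite.
    return substr_replace($digits, ',', -2, 0);
}
```

Note that this handles one price at a time; if the selector matches several elements, each one would need to be passed in separately.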
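OK, I've been stuck on this for a while now.
I am parsing data from a website and I want to get the price, but on the website the price contains no comma or dot. It is displayed like 499, with the 4 rendered larger than the 99.
When I do: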
$price = $oNode['div.item-prijs'];
echo $price;
it echoes 499. I want it to add a comma or dot between the 4 and the 99.
I tried:
$price = $oNode['div.item-prijs'];
$new_price = substr(chunk_split($price, 1, ','), 0, -1);
echo $new_price;
This echoes:
<,d,i,v, ,c,l,a,s,s,=,",i,t,e,m,-,p,r,i,j,s,",>,<,p,>,7,<,s,u,p,>,4,9,<,/,s,u,p,>,<,/,p,>,<,/,d,i,v,><,d,i,v, ,c,l,a,s,s,=,",i,t,e,m,-,p,r,i,j,s,",>,<,p,>,<,/,p,>,<,/,d,i,v,><,d,i,v, ,c,l,a,s,s,=,",i,t,e,m,-,p,r,i,j,s,",>,<,p,>,4,<,s,u,p,>,9,9,<,/,s,u,p,>,<,/,p,>,<,/,d,i,v,>…

I am trying to use phpQuery to get the links of all images on a given page. I am using phpQuery's PHP-style syntax.
Here is my code so far:
include('phpQuery-onefile.php');
$all = phpQuery::newDocumentFileHTML("http://www.mysite.com", $charset = 'utf-8');
// in theory this gives me all image sources
$images = $all->find('img')->attr('src');
// but if I do `echo $images;` what I get is the src to the first image
Out of curiosity, I tried
$images = $all->find('img:first')->attr('src');
and
$images = $all->find('img:last')->attr('src');
and they correctly printed the addresses of the first and last images respectively, but how can I get an array of all the links?
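As observed above, `attr('src')` on a multi-element match returns only the first element's value, so building an array requires visiting each `img` in turn. The sketch below shows that per-element loop with PHP's built-in DOM extension instead of phpQuery, so that it runs without extra libraries; `imageSources` is a hypothetical helper name, and the input is assumed to be the page markup as a string.

```php
<?php
// Collect the src attribute of every <img> in the markup.
// Uses the built-in DOM extension, not phpQuery.
function imageSources(string $html): array
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence parser warnings
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sources = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        if ($img->hasAttribute('src')) { // skip images without a src
            $sources[] = $img->getAttribute('src');
        }
    }
    return $sources;
}
```

The same pattern should carry over to phpQuery: iterating `$all->find('img')` with `foreach` is commonly reported to yield plain DOM elements, so something like `pq($img)->attr('src')` inside the loop would collect each value. That variant is not tested here.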