Cha*_*hra 6 php redirect curl dom simple-html-dom
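I have read more than 20 related questions on this site and searched Google, to no avail. I am new to PHP and am using PHP Simple HTML DOM Parser to fetch a URL. The script works on a local test page, but not on the URL I actually need it for.
Here is the code I wrote, following the example file that ships with the PHP Simple HTML DOM Parser library: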
<?php
include('simple_html_dom.php');
$html = file_get_html('http://www.farmersagent.com/Results.aspx?isa=1&name=A&csz=AL');
foreach($html->find('li.name ul#generalListing') as $e)
echo $e->plaintext;
?>
This is the error message I get:
Warning: file_get_contents(http://www.farmersagent.com/Results.aspx?isa=1&name=A&csz=AL) [function.file-get-contents]: failed to open stream: Redirection limit reached, aborting in /home/content/html/website.in/test/simple_html_dom.php on line 70
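Please guide me on what I should do to make it work. I am new to this, so please suggest a simple approach. While reading other questions and answers on this site, I tried the cURL approach of creating a handle, but could not get it to work. The cURL approach I tried keeps returning "Resource" or "Object". I don't know how to pass that to Simple HTML DOM Parser so that $html->find() works properly.
Please help! Thanks!
Had a similar problem today. I was using cURL and it wasn't returning any error. Testing with file_get_contents(), I got...
failed to open stream: Redirection limit reached, aborting
Did some searching and ended up with this function, which works for my case...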
function getPage($url) {
    $useragent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.89 Safari/537.36';
    $timeout = 120;
    // Cookie jar stored next to this script; the cookies/ directory must exist and be writable.
    $dir = dirname(__FILE__);
    $cookie_file = $dir . '/cookies/' . md5($_SERVER['REMOTE_ADDR']) . '.txt';
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    // Read cookies from and write them back to the same file, so they survive redirects.
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_ENCODING, "");
    // Return the response body as a string instead of printing it.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
    curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
    curl_setopt($ch, CURLOPT_REFERER, 'http://www.google.com/');
    $content = curl_exec($ch);
    // Capture any error before closing the handle; the original never reached curl_close() on success.
    $error = curl_errno($ch) ? curl_error($ch) : null;
    curl_close($ch);
    if ($error !== null) {
        echo 'error:' . $error;
        return false;
    }
    return $content;
}
The site was checking for a valid user agent and cookies.
The cookie issue was what caused it! :) Peace!
Workaround:
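Since the cookie file turned out to matter here, it may help to make sure the cookie jar path is actually writable before handing it to cURL. The following is only a sketch: cookieJarPath() is a hypothetical helper (not part of cURL or of the function above), and it takes the client key as a parameter because $_SERVER['REMOTE_ADDR'] is not set when a script runs from the command line.

```php
<?php
// Hypothetical helper: returns a cookie-jar file path under $baseDir/cookies,
// creating the directory first if it does not exist yet.
function cookieJarPath(string $baseDir, string $clientKey): string
{
    $dir = rtrim($baseDir, '/\\') . DIRECTORY_SEPARATOR . 'cookies';

    // mkdir() can lose a race with another request, so re-check is_dir() afterwards.
    if (!is_dir($dir) && !mkdir($dir, 0700, true) && !is_dir($dir)) {
        throw new RuntimeException('Cannot create cookie directory: ' . $dir);
    }

    // Same naming scheme as the function above: md5 of a per-client key.
    return $dir . DIRECTORY_SEPARATOR . md5($clientKey) . '.txt';
}
```

The returned path could then be passed to both CURLOPT_COOKIEFILE and CURLOPT_COOKIEJAR, for example cookieJarPath(__DIR__, $_SERVER['REMOTE_ADDR'] ?? 'cli').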
<?php
$context = stream_context_create(
array(
'http' => array(
'max_redirects' => 101
)
)
);
$content = file_get_contents('http://example.org/', false, $context);
?>
You can also specify a proxy, if there is one between you and the site:
$aContext = array('http'=>array('proxy'=>$proxy,'request_fulluri'=>true));
$cxContext = stream_context_create($aContext);
More details: https://cweiske.de/tagebuch/php-redirection-limit-reached.htm (thanks @jqpATs2w)
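The context options from the two snippets above can be combined into a single context. This is a minimal sketch, not a definitive fix: buildHttpContext() and the ExampleBot user-agent string are hypothetical, while follow_location, max_redirects, timeout, user_agent, proxy and request_fulluri are standard PHP HTTP context options. Note that raising the redirect limit does not help if the site is redirecting in a loop, for instance because it expects cookies.

```php
<?php
// Hypothetical helper: builds an HTTP stream context for file_get_contents().
function buildHttpContext(int $maxRedirects = 20, ?string $userAgent = null, ?string $proxy = null)
{
    $http = [
        'follow_location' => 1,             // follow Location: redirects
        'max_redirects'   => $maxRedirects, // give up after this many redirects
        'timeout'         => 30,            // read timeout in seconds
    ];

    if ($userAgent !== null) {
        // Some sites reject requests that carry no User-Agent header.
        $http['user_agent'] = $userAgent;
    }

    if ($proxy !== null) {
        // e.g. 'tcp://proxy.example.org:8080' (placeholder address)
        $http['proxy'] = $proxy;
        $http['request_fulluri'] = true;
    }

    return stream_context_create(['http' => $http]);
}

$context = buildHttpContext(20, 'Mozilla/5.0 (compatible; ExampleBot/1.0)');
$options = stream_context_get_options($context);
```

The resulting $context can be passed as the third argument of file_get_contents(), as in the workaround above.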
With cURL, you need to set the CURLOPT_RETURNTRANSFER option to true so that the call to curl_exec returns the response body, like this:
$url = 'http://www.farmersagent.com/Results.aspx?isa=1&name=A&csz=AL';
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
// Set this option if you need to follow redirects, though I didn't get any in your case
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
$content = curl_exec($curl);
curl_close($curl);
$html = str_get_html($content);
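Because curl_exec() returns false on failure, it can be worth checking the result before handing it to the parser. A minimal sketch, assuming a hypothetical wrapper named fetchHtml() (not part of cURL or Simple HTML DOM); the timeout and redirect limit are arbitrary illustrative values:

```php
<?php
// Hypothetical wrapper: returns the response body as a string,
// or throws if cURL could not complete the request.
function fetchHtml(string $url, int $timeoutSeconds = 30): string
{
    $curl = curl_init($url);
    curl_setopt_array($curl, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects
        CURLOPT_MAXREDIRS      => 10,   // but not forever
        CURLOPT_TIMEOUT        => $timeoutSeconds,
    ]);

    $body  = curl_exec($curl);
    $error = curl_error($curl); // read the message before closing the handle
    curl_close($curl);

    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . $error);
    }

    return $body;
}
```

The returned value is a plain string, so it can be passed straight to the parser: $html = str_get_html(fetchHtml($url));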