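I sometimes have a problem fetching URL data with curl, especially when the site's content is in Arabic or another non-Latin language. My curl function is: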
function file_get_contents_curl($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    $data = curl_exec($ch);
    $info = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
    //checking mime types
    if(strstr($info,'text/html')) {
        curl_close($ch);
        return $data;
    } else {
        return false;
    }
}
And this is how I fetch the data:
$html = file_get_contents_curl($checkurl);
$grid ='';
if($html)
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html);
    $nodes = $doc->getElementsByTagName('title');
    @$title = $nodes->item(0)->nodeValue;
    @$metas = $doc->getElementsByTagName('meta');
    for ($i = 0; $i < $metas->length; $i++)
    {
        $meta = $metas->item($i);
        if($meta->getAttribute('name') == 'description')
            $description = $meta->getAttribute('content');
    }
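I get all the data correctly from some Arabic sites, such as
http://www.emaratalyoum.com/multimedia/videos/2012-04-08-1.474873
but when I give it this YouTube URL
http://www.youtube.com/watch?v=Eyxljw31TtU&feature=g-logo&context=G2c4f841FOAAAAAAAFAA
it shows symbols instead. What setting do I need so that it shows exactly the same title and description?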
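Fetching Arabic text can be very tricky, but there are a few basic steps you need to get right, and they all concern UTF-8.
When you fetch the YouTube information, it is already provided in UTF-8, and the retrieval process adds an extra layer of UTF-8 encoding. I am not sure why this happens (a likely cause is that DOMDocument::loadHTML treats the input as ISO-8859-1 when it does not find a charset declaration it recognizes), but a simple utf8_decode fixes the problem.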
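The double encoding described above can be reproduced without any network access. The following is a minimal sketch, assuming the mbstring extension is available; the sample string and variable names are illustrative only and do not come from the pages in the question.

```php
<?php
// Assumption: the mbstring extension is loaded.
$original = 'مرحبا'; // a short Arabic sample, stored as UTF-8 bytes

// Misreading the UTF-8 bytes as ISO-8859-1 and re-encoding them as UTF-8
// produces double-encoded text, which displays as a run of odd symbols.
$doubleEncoded = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

// Converting back from UTF-8 to ISO-8859-1 strips the extra layer.
// This is the same conversion that utf8_decode performs.
$repaired = mb_convert_encoding($doubleEncoded, 'ISO-8859-1', 'UTF-8');

var_dump($doubleEncoded === $original); // bool(false)
var_dump($repaired === $original);      // bool(true)
```

This also shows why the decode step must only be applied to text that really is double-encoded: applied to correctly decoded Arabic, it would replace every character with a question mark.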
header('Content-Type: text/html; charset=UTF-8');
echo displayMeta("http://www.emaratalyoum.com/multimedia/videos/2012-04-08-1.474873");
echo displayMeta("http://www.youtube.com/watch?v=Eyxljw31TtU&feature=g-logo&context=G2c4f841FOAAAAAAAFAA");
emaratalyoum.com
?????? ????? ???????? ???? ???? ???? ????? ???? ?????? ?? ???? ???? ??? ????? ?? ????? ?????? ?????? ?????? ?? ????? ??????? ?? ???? ??? ???????? ????? ?????
youtube.com
??????.??? ????? ?????? ??? ??????? ??? ?????? ???? ????? ?? ????? ?????? ??? ???? ??? ?? ??? ??????? ???????: ???? "???? ?????? ??? ??? ??????"
displayMeta
function displayMeta($checkurl) {
    $html = file_get_contents_curl($checkurl);
    $grid = '';
    if ($html) {
        $doc = new DOMDocument("1.0", "UTF-8");
        // Suppress the warnings that malformed real-world HTML triggers
        @$doc->loadHTML($html);
        $nodes = $doc->getElementsByTagName('title');
        $title = $nodes->item(0)->nodeValue;
        $metas = $doc->getElementsByTagName('meta');
        for ($i = 0; $i < $metas->length; $i++) {
            $meta = $metas->item($i);
            if ($meta->getAttribute('name') == 'description') {
                $description = $meta->getAttribute('content');
                // YouTube descriptions come back double-encoded, so strip one UTF-8 layer
                if (stripos(parse_url($checkurl, PHP_URL_HOST), "youtube") !== false) {
                    return utf8_decode($description);
                } else {
                    return $description;
                }
            }
        }
    }
}
file_get_contents_curl
function file_get_contents_curl($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    $data = curl_exec($ch);
    $info = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
    // Release the handle on every path, not only on success
    curl_close($ch);
    // Only return the body when the request succeeded and the response is HTML
    if ($data !== false && strstr((string) $info, 'text/html')) {
        return $data;
    }
    return false;
}
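An alternative to calling utf8_decode for one particular host (utf8_decode is deprecated as of PHP 8.2) is to make the markup encoding-independent before DOMDocument parses it. The following is a minimal sketch, not part of the original answer. It assumes that the fetched HTML is UTF-8 and that the mbstring extension is available. The function name extractDescription and the sample markup are hypothetical.

```php
<?php
// Hypothetical helper: returns the meta description, or null if there is none.
// Assumptions: $html is UTF-8 and the mbstring extension is loaded.
function extractDescription(string $html): ?string
{
    // Turn every non-ASCII character into a numeric entity (&#...;) so the
    // parser sees pure ASCII and cannot misread the encoding.
    $ascii = mb_encode_numericentity($html, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8');

    $doc = new DOMDocument();
    // Suppress the warnings that malformed real-world HTML triggers
    @$doc->loadHTML($ascii);

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        if (strtolower($meta->getAttribute('name')) === 'description') {
            // DOM attribute values are always returned as UTF-8
            return $meta->getAttribute('content');
        }
    }
    return null;
}

$sample = '<html><head><title>t</title>'
        . '<meta name="description" content="مرحبا"></head><body></body></html>';
echo extractDescription($sample), PHP_EOL;
```

With this approach the same code path would serve every site whose pages are UTF-8, so no host check is needed. A page served in another encoding would have to be converted to UTF-8 first.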