Try using cURL:
/**
 * Get a web file (HTML, XHTML, XML, image, etc.) from a URL. Return an
 * array containing the HTTP server response header fields and content.
 */
function get_web_page( $url )
{
    $options = array(
        CURLOPT_RETURNTRANSFER => true,     // return web page
        CURLOPT_HEADER         => false,    // don't return headers
        CURLOPT_FOLLOWLOCATION => true,     // follow redirects
        CURLOPT_ENCODING       => "",       // handle all encodings
        CURLOPT_USERAGENT      => "spider", // who am i
        CURLOPT_AUTOREFERER    => true,     // set referer on redirect
        CURLOPT_CONNECTTIMEOUT => 120,      // timeout on connect
        CURLOPT_TIMEOUT        => 120,      // timeout on response
        CURLOPT_MAXREDIRS      => 10,       // stop after 10 redirects
    );

    $ch = curl_init( $url );
    curl_setopt_array( $ch, $options );
    $content = curl_exec( $ch );
    $err     = curl_errno( $ch );
    $errmsg  = curl_error( $ch );
    $header  = curl_getinfo( $ch );
    curl_close( $ch );

    $header['errno']   = $err;
    $header['errmsg']  = $errmsg;
    $header['content'] = $content;
    return $header;
}
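Just call the function with your URL. It returns an array, and the whole page is in the 'content' element, which you can echo into your PHP page.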
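Here is a rough sketch of how you might check the returned array before echoing anything. `page_or_error` is a hypothetical helper name, not part of cURL; it only assumes the array has the `errno`, `errmsg`, `http_code` and `content` keys that get_web_page fills in.

```php
<?php
// Hypothetical helper: takes the array returned by get_web_page() and
// returns either the page body or a short error description.
function page_or_error(array $result): string
{
    // errno is non-zero when cURL itself failed (DNS lookup, timeout, ...).
    if (($result['errno'] ?? 0) !== 0) {
        return 'cURL error: ' . ($result['errmsg'] ?? 'unknown');
    }
    // http_code comes from curl_getinfo(); anything but 200 is treated as a failure here.
    if (($result['http_code'] ?? 0) !== 200) {
        return 'HTTP error: ' . ($result['http_code'] ?? 0);
    }
    return (string) $result['content'];
}

// Typical use (needs network access and the get_web_page function):
// echo page_or_error(get_web_page('http://example.com/'));
```

Treating only 200 as success is a simplification; you may want to accept other 2xx codes.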
However, you may need to rewrite resource links such as stylesheets and images with a regular expression (for example, replace "/image.jpg" with "http://mydomain.com/image.jpg").
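A minimal sketch of that rewrite, assuming you only care about root-relative `src` and `href` attributes. `absolutize_links` is a made-up name, and the pattern deliberately leaves protocol-relative (`//host/...`) and already absolute URLs alone.

```php
<?php
// Hypothetical helper: prefix root-relative src/href values with $base.
function absolutize_links(string $html, string $base): string
{
    $base = rtrim($base, '/');
    // Match src="/..." or href="/..." (either quote style),
    // but skip protocol-relative values that start with "//".
    $pattern = '~\b(src|href)=(["\'])/(?!/)~i';
    return preg_replace_callback(
        $pattern,
        function (array $m) use ($base): string {
            // $m[1] is the attribute name, $m[2] the opening quote.
            return $m[1] . '=' . $m[2] . $base . '/';
        },
        $html
    );
}

// Example: rewrites /image.jpg to http://mydomain.com/image.jpg
echo absolutize_links('<img src="/image.jpg">', 'http://mydomain.com');
```

This does not touch document-relative paths like "image.jpg" or `url(...)` references inside CSS, so treat it as a starting point only.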
cURL is usually installed on shared hosting.
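If you are not sure whether your host has it, a quick check along these lines should tell you. `curl_available` is a hypothetical wrapper name; `extension_loaded` and `function_exists` are standard PHP functions.

```php
<?php
// Hypothetical helper: report whether the cURL extension can be used.
function curl_available(): bool
{
    // The extension must be loaded and curl_init must not be disabled.
    return extension_loaded('curl') && function_exists('curl_init');
}

echo curl_available() ? "cURL is available\n" : "cURL is not available\n";
```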
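If you want to grab only the body or the head of the page, you can use simplexml or a regular expression. (simplexml is great for traversing the DOM if the HTML is well-formed.)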
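As a rough illustration of the simplexml route, something like the following might work when the markup is well-formed XML. `extract_title_and_body` is a hypothetical helper; it returns null when the markup cannot be parsed, which is common with real-world HTML.

```php
<?php
// Hypothetical helper: pull the <title> text and the <body> markup out of
// a well-formed (X)HTML string. Returns null if the markup does not parse.
function extract_title_and_body(string $xhtml): ?array
{
    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = simplexml_load_string($xhtml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($doc === false) {
        return null; // not well-formed, simplexml cannot help here
    }

    return [
        'title' => isset($doc->head->title) ? trim((string) $doc->head->title) : '',
        'body'  => isset($doc->body) ? $doc->body->asXML() : '',
    ];
}

$page = '<html><head><title> Hi </title></head><body><p>x</p></body></html>';
$parts = extract_title_and_body($page);
echo $parts['title'], "\n";
```

In practice you would pass in the 'content' element returned by get_web_page.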