cURL HTML output differs from the original page when rendered

Ken*_*edy 4 html php curl file-get-contents

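I'm working on a project that fetches a page using cURL or file_get_contents. The problem is that when I echo the fetched HTML, the output looks different from the original page, and not all of the images are displayed. Is there a solution? My code: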

    <?php
    // URL of the page to fetch
    $url = "http://www.google.com";

    // Fetch the HTML of the given URL with cURL
    function get_data($url)
    {
        $ch = curl_init();
        $timeout = 5;
        //$userAgent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.X.Y.Z Safari/525.13";
        $userAgent = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)";
        curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
        curl_setopt($ch, CURLOPT_FAILONERROR, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_AUTOREFERER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 10);
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
        $data = curl_exec($ch);
        curl_close($ch);
        return $data;
    }

    // Use the cURL helper defined above; file_get_contents($url) returns the same kind of raw HTML
    $html = get_data($url);
    echo $html;
    ?>

Thanks

Pet*_*tai 8

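You should use <base> to specify a base URL for all relative links:

If you fetch http://example.com/thisPage.html with cURL, add a <base> tag to the output you echo. Technically it belongs inside <head>, but this will work: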

    echo '<base href="http://example.com/" />';
    echo $html;

Live examples: without <base> (broken) and with <base>
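
To put the <base> tag inside <head> instead of in front of the whole document, something along these lines might work. This is only a sketch: `add_base_tag` is a hypothetical helper name, the sample markup and `http://example.com/` are placeholders, and it assumes the fetched page has an ordinary opening `<head>` tag (if not, it falls back to prepending the tag as shown above).

```php
<?php
// Hypothetical helper (not from the original post): inserts a <base> tag
// directly after the opening <head> tag so that relative URLs in the
// fetched HTML resolve against $baseUrl.
function add_base_tag(string $html, string $baseUrl): string
{
    $base = '<base href="' . htmlspecialchars($baseUrl, ENT_QUOTES) . '">';

    // Match <head> with optional attributes, case-insensitively.
    // A callback is used so that "$" or "\" in the URL is never
    // interpreted as a backreference. Only the first match is replaced.
    $result = preg_replace_callback(
        '/<head\b[^>]*>/i',
        function ($m) use ($base) {
            return $m[0] . $base;
        },
        $html,
        1,
        $count
    );

    // Fall back to prepending the tag when no <head> was found.
    return ($result !== null && $count > 0) ? $result : $base . $html;
}

// Placeholder markup standing in for the HTML returned by cURL.
$page = '<html><head><title>t</title></head><body><img src="logo.png"></body></html>';
echo add_base_tag($page, 'http://example.com/');
```

With the placeholder page, the relative `logo.png` would then be requested from `http://example.com/logo.png` rather than from the server running the script. Pages that already declare their own <base> are not handled by this sketch.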

  • Excellent, much better than rewriting all the links by hand. Here is a simple way to put <base> in the right place: `$response = preg_replace("/<head>/i", "<head><base href='$url'/>", $response, 1);` (4 upvotes)