How do I get the real URL after file_get_contents if a redirect happens?

Question

How do I get the real URL after file_get_contents if a redirect happens?

I'm using file_get_contents() to fetch content from a website, and, surprisingly, it works even when the URL I pass as the argument redirects to another URL.

The problem is that I need to know the new URL. Is there a way to do that?

Answer 1

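If you need to use file_get_contents() rather than curl, don't follow redirects automatically; the response headers, including any Location header, then show up in $http_response_header: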

$context = stream_context_create(
    array(
        'http' => array(
            'follow_location' => false
        )
    )
);

$html = file_get_contents('http://www.example.com/', false, $context);

var_dump($http_response_header);


Answer inspired by: How to ignore a moved header with file_get_contents in PHP?

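@PetrPeller It's a special PHP variable: http://php.net/manual/en/reserved.variables.httpresponseheader.php (7 upvotes)
Where do you get `$http_response_header` from? (4 upvotes)
I tried this, and although it does stop the redirect, as described in the question linked at the end of this answer, it doesn't provide the "real URL" this question asks for. It could also be that the server I'm trying it against doesn't support it. In my opinion, though, the curl() approach is the only reliable one. (2 upvotes)
@RPorter You need to extract the 301 Location from `$http_response_header`. (2 upvotes)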
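
To make the last comment concrete, here is a minimal sketch of pulling the Location value out of a list of header lines shaped like `$http_response_header`. The function name `find_location_header` and the sample headers are assumptions made for illustration; they are not part of the answer above.

```php
<?php
// Hypothetical helper: returns the value of the last Location header found
// in a list of header lines (such as $http_response_header), or null if
// there is none.
function find_location_header(array $headers): ?string
{
    $location = null;
    foreach ($headers as $line) {
        // Header names are case-insensitive, hence the "i" flag.
        if (preg_match('/^Location:[ \t]*(.*)$/i', $line, $m)) {
            $location = trim($m[1]);
        }
    }
    return $location;
}

// Sample header lines standing in for a real $http_response_header.
$sample = array(
    'HTTP/1.1 301 Moved Permanently',
    'Location: http://www.example.com/new',
    'Content-Type: text/html',
);
var_dump(find_location_header($sample));
```

With `follow_location` set to false there should be at most one Location header per response; taking the last match is only a precaution for header lists that contain several responses.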

Answer 2

ale*_*lex 18

You could make the request with cURL instead of file_get_contents().

Something like this should work...

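$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, TRUE);          // include response headers in the output
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, FALSE); // do not follow the redirect
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);  // return the response instead of printing it
$a = curl_exec($ch);
// Header names are case-insensitive ("i"); "m" anchors the match to a line start.
if (is_string($a) && preg_match('#^Location:[ \t]*(.*)$#mi', $a, $r)) {
    $l = trim($r[1]); // the redirect target
}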


Source

@alex Hey... I think the point is that he was asking about file_get_contents(), so when you google the problem, this is what you find. (2 upvotes)
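
One caveat with matching against the raw cURL output: with CURLOPT_HEADER enabled, the string contains the headers followed by the body, so a pattern run over the whole string could also match text in the body. The sketch below, which assumes a single header block and uses the hypothetical name `location_from_raw_response`, searches only the part before the first blank line.

```php
<?php
// Hypothetical helper: extracts the Location value from a raw response
// (headers + body), e.g. the string curl_exec() returns when CURLOPT_HEADER
// and CURLOPT_RETURNTRANSFER are enabled. Assumes one header block only.
function location_from_raw_response(string $raw): ?string
{
    // Headers end at the first blank line; everything after it is body.
    $parts = preg_split('/\r?\n\r?\n/', $raw, 2);
    $head = $parts[0];

    // "m" lets ^ and $ match at line boundaries, "i" ignores header case.
    if (preg_match('/^Location:[ \t]*(.*)$/mi', $head, $m)) {
        return trim($m[1]);
    }
    return null;
}

// Made-up response used only to exercise the function.
$raw = "HTTP/1.1 302 Found\r\n"
     . "Location: http://www.example.com/new\r\n"
     . "Content-Length: 22\r\n"
     . "\r\n"
     . "Location: not-a-header";
var_dump(location_from_raw_response($raw));
```

Responses that pass through a proxy or include an interim status can carry more than one header block, which this sketch does not handle.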

Answer 3

Ren*_*aud 17

Everything in one function:

function get_web_page( $url ) {
    $res = array();
    $options = array( 
        CURLOPT_RETURNTRANSFER => true,     // return web page 
        CURLOPT_HEADER         => false,    // do not return headers 
        CURLOPT_FOLLOWLOCATION => true,     // follow redirects 
        CURLOPT_USERAGENT      => "spider", // who am i 
        CURLOPT_AUTOREFERER    => true,     // set referer on redirect 
        CURLOPT_CONNECTTIMEOUT => 120,      // timeout on connect 
        CURLOPT_TIMEOUT        => 120,      // timeout on response 
        CURLOPT_MAXREDIRS      => 10,       // stop after 10 redirects 
    ); 
    $ch      = curl_init( $url ); 
    curl_setopt_array( $ch, $options ); 
    $content = curl_exec( $ch ); 
    $err     = curl_errno( $ch ); 
    $errmsg  = curl_error( $ch ); 
    $header  = curl_getinfo( $ch ); 
    curl_close( $ch ); 

    $res['content'] = $content;     
    $res['url'] = $header['url'];
    return $res; 
}  
print_r(get_web_page("http://www.example.com/redirectfrom"));


Answer 4

Mar*_*ryl 6

A complete solution using bare file_get_contents (note the in/out $url parameter):

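function get_url_contents_and_final_url(&$url, $max_redirects = 10)
{
    // Disable automatic redirects so each Location header can be inspected.
    $context = stream_context_create(
        array(
            "http" => array(
                "follow_location" => false,
            ),
        )
    );
    $pattern = "/^Location:\s*(.*)$/i";
    $redirects = 0;

    do
    {
        $result = file_get_contents($url, false, $context);
        if ($result === false)
        {
            // Request failed; there are no fresh headers to inspect.
            break;
        }

        $headers = isset($http_response_header) ? $http_response_header : array();
        $location_headers = preg_grep($pattern, $headers);

        if (!empty($location_headers) &&
            preg_match($pattern, array_values($location_headers)[0], $matches))
        {
            $url = $matches[1];
            // Stop after $max_redirects hops so a redirect loop cannot run forever.
            $repeat = (++$redirects < $max_redirects);
        }
        else
        {
            $repeat = false;
        }
    }
    while ($repeat);

    return $result;
}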


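Note that this only works with absolute URLs in the Location header. If you need to support relative URLs, see PHP: How to resolve a relative url.

For example, if you use the solution from @Joyce Babu's answer, replace: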

            $url = $matches[1];


with:

            $url = getAbsoluteURL($matches[1], $url);
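
The getAbsoluteURL() function referenced above comes from another answer and is not reproduced here. As a rough illustration of what such a helper has to do, the sketch below resolves a Location value against the URL that returned it. The name `resolve_location` is hypothetical, and the logic is deliberately simplified: it does not normalize `../` segments and does not handle query-only references such as `?page=2`.

```php
<?php
// Hypothetical, simplified helper (not the getAbsoluteURL() mentioned above):
// resolves a Location value against the URL of the response that sent it.
function resolve_location(string $location, string $base): string
{
    // Already absolute: starts with a scheme such as http:// or https://
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $location)) {
        return $location;
    }

    $parts = parse_url($base);
    $scheme = $parts['scheme'] ?? 'http';
    $authority = $parts['host'] ?? '';
    if (isset($parts['port'])) {
        $authority .= ':' . $parts['port'];
    }

    // Protocol-relative: //other.example/path keeps the base scheme.
    if (strpos($location, '//') === 0) {
        return $scheme . ':' . $location;
    }

    // Root-relative: /path replaces the whole base path.
    if (strpos($location, '/') === 0) {
        return $scheme . '://' . $authority . $location;
    }

    // Path-relative: replace everything after the last slash of the base path.
    $base_path = $parts['path'] ?? '/';
    $slash = strrpos($base_path, '/');
    $dir = ($slash === false) ? '/' : substr($base_path, 0, $slash + 1);
    return $scheme . '://' . $authority . $dir . $location;
}

var_dump(resolve_location('/new', 'http://www.example.com/old/page'));
```

A production version would more likely follow the reference-resolution algorithm in RFC 3986 than this shortcut.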


Asked:	15 years, 2 months ago
Viewed:	41055 times
Last active:	7 years, 10 months ago