Why does get_headers() return 400 Bad Request while CLI curl returns 200 OK?

Question

I am trying to fetch HTTP headers using the native function get_headers():

$headers = get_headers('https://www.grammarly.com');

The result is:

HTTP/1.1 400 Bad Request
Date: Fri, 27 Apr 2018 12:32:34 GMT
Content-Type: text/plain; charset=UTF-8
Content-Length: 52
Connection: close

However, if I do the same thing with the command-line tool curl, the result is different:

curl -sI https://www.grammarly.com/

HTTP/1.1 200 OK
Date: Fri, 27 Apr 2018 12:54:47 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 25130
Connection: keep-alive

What causes this difference in responses? Is it some kind of poorly implemented security feature on Grammarly's server side, or something else?

Answer 1

Ant*_*ony 5

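This happens because get_headers() uses the default stream context, which basically means that almost no HTTP headers are sent to the URL, and most remote servers are picky about that. Usually, the missing header most likely to cause problems is User-Agent. You can set it manually with stream_context_set_default before calling get_headers(). Here is an example that worked for me: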

$headers = get_headers('https://www.grammarly.com');

print_r($headers);

// has [0] => HTTP/1.1 400 Bad Request

stream_context_set_default(
    array(
        'http' => array(
            'user_agent'=>"php/testing"
        ),
    )
);

$headers = get_headers('https://www.grammarly.com');

print_r($headers);

// has [0] => HTTP/1.1 200 OK
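
Changing the default context affects every later stream call in the script. As an alternative, here is a minimal sketch that passes a context to one get_headers() call only, assuming PHP 7.1 or later (where get_headers() accepts a context as its third argument). The helper name build_ua_context and the "php/testing" value are hypothetical, chosen for illustration, and the HEAD method is an assumption made to mirror `curl -I`:

```php
<?php
// Hypothetical helper (not part of PHP): builds a stream context that
// sends a User-Agent header and issues a HEAD request, similar to `curl -I`.
function build_ua_context(string $userAgent = 'php/testing')
{
    return stream_context_create([
        'http' => [
            'method'     => 'HEAD',     // request headers only
            'user_agent' => $userAgent, // the header the default context omits
        ],
    ]);
}

// Usage (performs a network request, so it is left commented out here):
//
// $headers = get_headers('https://www.grammarly.com', false, build_ua_context());
// if ($headers === false) {
//     echo "Request failed\n";
// } else {
//     echo $headers[0], "\n"; // status line of the response
// }
```

Whether a given server accepts this particular User-Agent value is up to that server, so the status line you get back may differ.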
