来自请求 Python 库的 HTTP 请求中缺少主机标头

Mag*_*ero 2 python http http-headers python-requests

Python 库生成的 HTTP 请求消息中的HTTP/1.1 强制 Host 头字段requests哪里?

import requests

response = requests.get("https://www.google.com/")
print(response.request.headers)
Run Code Online (Sandbox Code Playgroud)

输出这个:

{'User-Agent': 'python-requests/2.22.0', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}

Dee*_*ace 5

默认情况下HOST不会将标头添加到请求中requests。如果未明确添加,则将决策委托给底层http模块。

请参阅以下部分http/client.py

(如果'Host'标题明确提供requests.get然后skip_hostTrue

    if self._http_vsn == 11:
        # Issue some standard headers for better HTTP/1.1 compliance

        if not skip_host:
            # this header is issued *only* for HTTP/1.1
            # connections. more specifically, this means it is
            # only issued when the client uses the new
            # HTTPConnection() class. backwards-compat clients
            # will be using HTTP/1.0 and those clients may be
            # issuing this header themselves. we should NOT issue
            # it twice; some web servers (such as Apache) barf
            # when they see two Host: headers

            # If we need a non-standard port,include it in the
            # header.  If the request is going through a proxy,
            # but the host of the actual URL, not the host of the
            # proxy.

            netloc = ''
            if url.startswith('http'):
                nil, netloc, nil, nil, nil = urlsplit(url)

            if netloc:
                try:
                    netloc_enc = netloc.encode("ascii")
                except UnicodeEncodeError:
                    netloc_enc = netloc.encode("idna")
                self.putheader('Host', netloc_enc)
            else:
                if self._tunnel_host:
                    host = self._tunnel_host
                    port = self._tunnel_port
                else:
                    host = self.host
                    port = self.port

                try:
                    host_enc = host.encode("ascii")
                except UnicodeEncodeError:
                    host_enc = host.encode("idna")

                # As per RFC 273, IPv6 address should be wrapped with []
                # when used as Host header

                if host.find(':') >= 0:
                    host_enc = b'[' + host_enc + b']'

                if port == self.default_port:
                    self.putheader('Host', host_enc)
                else:
                    host_enc = host_enc.decode("ascii")
                    self.putheader('Host', "%s:%s" % (host_enc, port)) 
Run Code Online (Sandbox Code Playgroud)

因此,'Host'在检查requests发送到服务器的标头时,我们看不到标头。

如果我们向http://httpbin/get发送请求并打印响应,我们可以看到Host标题确实已发送。

import requests

response = requests.get("http://httpbin.org/get")
print('Response from httpbin/get')
print(response.json())
print()
print('response.request.headers')
print(response.request.headers)
Run Code Online (Sandbox Code Playgroud)

输出

Response from httpbin/get
{'args': {}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 
 'Host': 'httpbin.org', 'User-Agent': 'python-requests/2.20.0'},
 'origin': 'XXXXXX', 'url': 'https://httpbin.org/get'}

response.request.headers
{'User-Agent': 'python-requests/2.20.0', 'Accept-Encoding': 'gzip, deflate', 
 'Accept': '*/*', 'Connection': 'keep-alive'}
Run Code Online (Sandbox Code Playgroud)