How do I get the IP address from an HTTP request using the requests library?

gaw*_*wry 19 python pycurl httplib2 httplib python-requests

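I am making HTTP requests with the requests library in Python, but I need the IP address of the server that responded to the request, and I want to avoid making two calls (the second call might also be answered from a different IP address than the one that served the original request).

Is that possible? Does any Python HTTP library allow me to do this?

P.S.: I also need to make HTTPS requests and use an authenticated proxy.

Update 1:

Example: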

import requests

proxies = {
  "http": "http://user:password@10.10.1.10:3128",
  "https": "http://user:password@10.10.1.10:1080",
}

response = requests.get("http://example.org", proxies=proxies)
response.ip # This doesn't exist; it is just an example of what I would like to do

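So, I would like to know which IP address the request connected to, through a method or attribute on the response. In other libraries I was able to do this by finding the sock object and calling its getpeername() method.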
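For reference, the sock/getpeername() approach mentioned above looks roughly like this with the standard library's http.client (httplib in Python 2). This is only a sketch: the throwaway local server, `QuietHandler`, and `fetch_with_peer` are hypothetical names made up for the illustration, and no proxy is involved.

```python
import http.client
import http.server
import threading


class QuietHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical handler for a throwaway local test server."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging


def fetch_with_peer(host, port, path="/"):
    """Return (status, peer address) for one GET over a single connection."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        # Read the address before getresponse(): the connection may drop
        # its socket reference once the response has been handed over.
        peer = conn.sock.getpeername()
        response = conn.getresponse()
        response.read()
        return response.status, peer
    finally:
        conn.close()


# Port 0 asks the OS for any free port.
server = http.server.HTTPServer(("127.0.0.1", 0), QuietHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

status, peer = fetch_with_peer("127.0.0.1", server.server_port)
print(status, peer)

server.shutdown()
server.server_close()
```

Because the address is read from the same socket that carried the request, no second call (and no second DNS lookup) is needed.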

Mat*_*ttH 33

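It turns out that it's rather involved.

Here's a monkey patch, written against requests version 1.2.3:

It wraps the _make_request method on HTTPConnectionPool to store the result of socket.getpeername() on the HTTPResponse instance.

For me, on Python 2.7.3, this instance is available as response.raw._original_response.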

from requests.packages.urllib3.connectionpool import HTTPConnectionPool

def _make_request(self,conn,method,url,**kwargs):
    response = self._old_make_request(conn,method,url,**kwargs)
    sock = getattr(conn,'sock',False)
    if sock:
        setattr(response,'peer',sock.getpeername())
    else:
        setattr(response,'peer',None)
    return response

HTTPConnectionPool._old_make_request = HTTPConnectionPool._make_request
HTTPConnectionPool._make_request = _make_request

import requests

r = requests.get('http://www.google.com')
print r.raw._original_response.peer

Output:

('2a00:1450:4009:809::1017', 80, 0, 0)

Ah, if a proxy is involved or the response is chunked, HTTPConnectionPool._make_request isn't called.

So here's a new version that patches httplib.HTTPConnection.getresponse instead:

import httplib

def getresponse(self,*args,**kwargs):
    response = self._old_getresponse(*args,**kwargs)
    if self.sock:
        response.peer = self.sock.getpeername()
    else:
        response.peer = None
    return response


httplib.HTTPConnection._old_getresponse = httplib.HTTPConnection.getresponse
httplib.HTTPConnection.getresponse = getresponse

import requests

def check_peer(resp):
    orig_resp = resp.raw._original_response
    if hasattr(orig_resp,'peer'):
        return getattr(orig_resp,'peer')

Running:

>>> r1 = requests.get('http://www.google.com')
>>> check_peer(r1)
('2a00:1450:4009:808::101f', 80, 0, 0)
>>> r2 = requests.get('https://www.google.com')
>>> check_peer(r2)
('2a00:1450:4009:808::101f', 443, 0, 0)
>>> r3 = requests.get('http://wheezyweb.readthedocs.org/en/latest/tutorial.html#what-you-ll-build')
>>> check_peer(r3)
('162.209.99.68', 80)

I also checked running with a proxy set; in that case the proxy's address is returned.


Update 2016/01/19

est offers an alternative that doesn't need the monkey patch:

rsp = requests.get('http://google.com', stream=True)
# grab the IP while you can, before you consume the body!!!!!!!!
print rsp.raw._fp.fp._sock.getpeername()
# consume the body, which calls the read(), after that fileno is no longer available.
print rsp.content  

Update 2016/05/19

Copied here from the comments for visibility: Richard Kenneth Niescior confirmed that the following works with requests 2.10.0 and Python 3.

rsp=requests.get(..., stream=True)
rsp.raw._connection.sock.getpeername()

Update 2019/02/22

With Python 3 and requests version 2.19.1:

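rsp = requests.get(..., stream=True)
# Note: getsockname() returns the local address; getpeername() returns the remote (server) address
rsp.raw._connection.sock.socket.getsockname()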
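Since the attribute path differs between the versions listed in these updates, a defensive helper may be easier to live with than a hard-coded chain. The sketch below is hypothetical: `peer_address` is a made-up name, it relies only on the private attributes quoted above (`raw._connection.sock`, plus `.socket` on the pyOpenSSL wrapper), and those attributes are not a stable API and may change or disappear in other releases.

```python
def peer_address(response):
    """Best-effort remote address of a streamed response, or None.

    Assumes the private attribute layout quoted in the updates above.
    """
    raw = getattr(response, "raw", None)
    conn = getattr(raw, "_connection", None)
    sock = getattr(conn, "sock", None)
    if sock is None:
        return None
    # A pyOpenSSL-wrapped socket keeps the real socket in `.socket`;
    # plain and ssl sockets have no such attribute, so use them directly.
    inner = getattr(sock, "socket", sock)
    try:
        return inner.getpeername()
    except OSError:
        # The socket was already closed or is not connected.
        return None
```

It would be called as peer_address(requests.get(url, stream=True)) before the body is consumed, since the connection is released once the content has been read.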

  • It now seems to be `resp.raw._connection.sock.socket.getsockname()` with requests version `2.19.1`. Note that in my case the type of `resp.raw._connection.sock` is `urllib3.contrib.pyopenssl.WrappedSocket`. (4 upvotes)
  • @MattH `getsockname()` is the local name and `getpeername()` is the remote name; your answer mixes the two up in the updates. (4 upvotes)
  • I just needed this, and I found that `rsp = requests.get(..., stream=True); rsp.raw._connection.sock.getpeername()` works. I have `requests 2.10.0` and use Python 3. (3 upvotes)
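The local/remote distinction raised in the comments can be checked with two plain sockets on the loopback interface. This is a minimal standard-library sketch; the variable names are illustrative only.

```python
import socket

# A listening socket stands in for the remote server (port 0 = any free port).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
server_addr = listener.getsockname()

client = socket.create_connection(server_addr)
accepted, client_addr = listener.accept()

local_end = client.getsockname()   # our own side of the connection
remote_end = client.getpeername()  # the side we connected to

print("local :", local_end)
print("remote:", remote_end)

accepted.close()
client.close()
listener.close()
```

From the client's point of view, only getpeername() reports the server's address; getsockname() reports the ephemeral port the client itself was given.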