Python - Get header information from a URL

Adi*_*dib 4 python python-3.x

I have been searching for a Python 3.x code example that gets HTTP header information.

Something as simple as an equivalent of PHP's get_headers is not easy to find in Python. Or maybe I am just not sure how best to phrase the search.

Essentially, I want to write some code that tells me whether a URL exists.

Something along the lines of:

h = get_headers(url)
if(h[0] == 200)
{
   print("Bingo!")
}

So far, I have tried

h = http.client.HTTPResponse('http://docs.python.org/')

but it always raises an error.

Joh*_*web 11

To get the HTTP response code, use the urllib.request module:

>>> import urllib.request
>>> response =  urllib.request.urlopen(url)
>>> response.getcode()
200
>>> if response.getcode() == 200:
...     print('Bingo')
... 
Bingo

The returned HTTPResponse object also gives you access to all headers. For example:

>>> response.getheader('Server')
'Apache/2.2.16 (Debian)'
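To look at every header rather than a single one, a minimal sketch might look like this. The helper name `fetch_headers` and its `timeout` parameter are assumptions made for illustration and are not part of the answer above; the sketch relies on the response's `headers` attribute, which can be iterated as name/value pairs.

```python
import urllib.request


def fetch_headers(url, timeout=10):
    """Hypothetical helper: return the response headers of `url` as a dict."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # response.headers is a message object; items() yields (name, value) pairs.
        # Converting to a dict keeps only the last value of a repeated header.
        return dict(response.headers.items())


# Example usage (performs a real network request):
# for name, value in fetch_headers("http://docs.python.org/").items():
#     print(f"{name}: {value}")
```

Using the response as a context manager closes the connection once the headers have been copied out.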

If the call to urllib.request.urlopen() fails because the server returns an error status, an HTTPError exception is raised. You can catch it to get the response code:

import urllib.error
import urllib.request

try:
    response = urllib.request.urlopen(url)
    if response.getcode() == 200:
        print('Bingo')
    else:
        print('The response code was not 200, but: {}'.format(
            response.getcode()))
except urllib.error.HTTPError as e:
    # Raised for HTTP error statuses such as 404 or 500
    print('''An error occurred: {}
The response code was {}'''.format(e, e.getcode()))
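Putting the pieces together for the original goal of checking whether a URL exists, one possible sketch is shown below. The function name `url_exists`, the `timeout` parameter, and the choice of a HEAD request (to avoid downloading the body) are assumptions for illustration, not something the answer above prescribes. Treating only 2xx responses as "exists" is also a judgment call, and some servers reject HEAD requests, in which case a plain GET may be needed instead.

```python
import urllib.error
import urllib.request


def url_exists(url, timeout=10):
    """Hypothetical helper: True if a HEAD request to `url` gets a 2xx status."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except urllib.error.HTTPError:
        # The server answered, but with an error status such as 404 or 500.
        return False
    except urllib.error.URLError:
        # No HTTP response at all: DNS failure, refused connection, and so on.
        return False


# Example usage (performs a real network request):
# if url_exists("http://docs.python.org/"):
#     print("Bingo!")
```

HTTPError is a subclass of URLError, so it has to be caught first if the two cases are ever to be handled differently.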