jft*_*uga 2 python urllib python-3.x
import urllib.request
url="http://espn.com"
f = urllib.request.urlopen(url)
contents = f.read().decode('latin-1')
q = f.geturl()
print(q)
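This code returns http://espn.go.com/, which is what I want: the URL of the site after the redirect. After reading the Python documentation, searching Google, and so on, I cannot figure out how to:
How can I do this in Python 3? If there is a better module than urllib, I am fine with that.
There is a better module; it is called requests: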
import requests
url = "http://espn.com"  # same URL as in the question
session = requests.Session()
session.headers['User-Agent'] = 'My-requests-agent/0.1'
resp = session.get(url)
contents = resp.text  # already decoded to str, using the encoding the server declared (e.g. latin-1)
print(resp.url)  # final URL, after redirects
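requests follows redirects (check resp.history to see the redirects that were followed). When you use a session (optional), cookies are stored and passed on to subsequent requests. You can set headers per request or per session; headers set on the session are sent with every request made through that session.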
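To make this behaviour easier to see without depending on an external site, here is a minimal sketch that runs against a throwaway local server. The server, the `DemoHandler` class, the `/start` and `/final` paths, and the `visited` cookie are all made up for illustration; only the `requests` calls correspond to the answer above.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: /start redirects to /final and sets a cookie."""

    def do_GET(self):
        if self.path == "/start":
            self.send_response(302)
            self.send_header("Location", "/final")
            self.send_header("Set-Cookie", "visited=yes; Path=/")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            # Echo the User-Agent and Cookie headers the client sent.
            echoed = "{}|{}".format(
                self.headers.get("User-Agent"), self.headers.get("Cookie")
            )
            body = echoed.encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


# Port 0 lets the operating system pick a free port.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:{}".format(server.server_port)

session = requests.Session()
session.trust_env = False  # ignore proxy settings from the environment
session.headers["User-Agent"] = "My-requests-agent/0.1"  # sent on every request

resp = session.get(base + "/start", timeout=5)
server.shutdown()

print(resp.url)                                # final URL, after the redirect
print([r.status_code for r in resp.history])   # status codes of the redirect hops
print(session.cookies.get("visited"))          # cookie stored by the session
print(resp.text)                               # headers the server saw on /final
```

The last line should show that both the session-level User-Agent and the cookie set during the redirect reached the final request, which is the behaviour described above.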
A simple demo using urllib (Python 3):
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import os.path
import urllib.request
from urllib.parse import urlencode
from http.cookiejar import CookieJar, MozillaCookieJar

# Install a global opener so that every urlopen() call stores and sends cookies.
cj = MozillaCookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
urllib.request.install_opener(opener)
cookie_file = os.path.abspath('./cookies.txt')

def load_cookies(cj, cookie_file):
    cj.load(cookie_file)

def save_cookies(cj, cookie_file):
    cj.save(cookie_file, ignore_discard=True, ignore_expires=True)

def dorequest(url, cj=None, data=None, timeout=10, encoding='UTF-8'):
    # With data the request is sent as a POST; without data it is a GET.
    data = urlencode(data).encode(encoding) if data else None
    request = urllib.request.Request(url)
    request.add_header('User-Agent', 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)')
    f = urllib.request.urlopen(request, data, timeout=timeout)
    return f.read()

def dopost(url, cj=None, data=None, timeout=10, encoding='UTF-8'):
    body = dorequest(url, cj, data, timeout, encoding)
    return body.decode(encoding)
If a redirect occurs, you should check the headers (30x status codes).
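Because urllib follows redirects automatically, the response you get back normally carries the final status rather than the 30x one. One possible way to see the intermediate 30x responses is to subclass `urllib.request.HTTPRedirectHandler` and record each hop. The sketch below does that against a throwaway local server; `DemoHandler`, `RecordingRedirectHandler`, `fetch`, and the `/start` and `/final` paths are hypothetical names chosen for this illustration.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: /start answers 302 and points at /final."""

    def do_GET(self):
        if self.path == "/start":
            self.send_response(302)
            self.send_header("Location", "/final")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"done"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


class RecordingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Follows redirects as usual, but remembers each 30x code and target."""

    def __init__(self):
        self.seen = []

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        self.seen.append((code, newurl))
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def fetch(url):
    recorder = RecordingRedirectHandler()
    # ProxyHandler({}) disables proxies taken from the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}), recorder)
    opener.addheaders = [("User-Agent", "demo-agent/0.1")]
    with opener.open(url, timeout=5) as resp:
        return resp.geturl(), resp.status, recorder.seen


# Port 0 lets the operating system pick a free port.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_port

final_url, status, redirects = fetch(base + "/start")
server.shutdown()

print(final_url)   # URL after the redirect, as geturl() reports it
print(status)      # status of the final response
print(redirects)   # list of (30x code, target URL) pairs seen on the way
```

Passing an instance of the subclass to `build_opener` replaces the default redirect handler, so redirects are still followed while each 30x status is kept for inspection.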