Python requests library times out, but the browser gets a response

Mic*_*l N 4 python user-agent web-scraping python-requests

I am trying to build a web scraper for NBA data. When I run the following code:

import requests

response = requests.get('https://stats.nba.com/stats/leaguedashplayerstats?College=&Conference=&Country=&DateFrom=10%2F20%2F2017&DateTo=10%2F20%2F2017&Division=&DraftPick=&DraftYear=&GameScope=&GameSegment=&Height=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=Totals&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2017-18&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&StarterBench=&TeamID=0&VsConference=&VsDivision=&Weight=')

the request fails with a timeout error:

  File "C:\ProgramData\Anaconda3\lib\site-packages\requests\api.py", line 70, in get
    return request('get', url, params=params, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\requests\api.py", line 56, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\requests\sessions.py", line 488, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\requests\sessions.py", line 609, in send
    r = adapter.send(request, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\requests\adapters.py", line 473, in send
    raise ConnectionError(err, request=request)

ConnectionError: ('Connection aborted.', OSError("(10060, 'WSAETIMEDOUT')",))

However, when I open the same URL in a browser, I get a response.

Moi*_*dri 6

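It looks like the site you mention checks the "User-Agent" header of the request. You can fake the "User-Agent" in your request so that it looks like it comes from a real browser, and you will receive a response.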

For example:

>>> import requests
>>> url = "https://stats.nba.com/stats/leaguedashplayerstats?College=&Conference=&Country=&DateFrom=10%2F20%2F2017&DateTo=10%2F20%2F2017&Division=&DraftPick=&DraftYear=&GameScope=&GameSegment=&Height=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=Totals&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2017-18&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&StarterBench=&TeamID=0&VsConference=&VsDivision=&Weight="
>>> headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36'}

>>> response = requests.get(url, headers=headers)
>>> response.status_code
200

>>> response.text  # will return the website content
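
If you make several requests, it may be more convenient to set the header once on a Session and to pass an explicit timeout, so that a blocked request fails quickly instead of hanging. The following is only a sketch under assumptions: `make_session` and `fetch` are hypothetical helper names, the timeout values are arbitrary, and the site may check more than the "User-Agent" header.

```python
import requests

# Browser-like User-Agent copied from the example above.
BROWSER_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
)


def make_session(user_agent=BROWSER_UA):
    """Return a Session that sends `user_agent` on every request (hypothetical helper)."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


def fetch(session, url, timeout=(5, 30)):
    """GET `url`; return the Response, or None on a connection error or timeout.

    `timeout` is (connect seconds, read seconds); the values are an assumption.
    """
    try:
        response = session.get(url, timeout=timeout)
    except (requests.exceptions.ConnectionError, requests.exceptions.Timeout) as exc:
        print("request failed:", exc)
        return None
    response.raise_for_status()  # raise on 4xx/5xx instead of returning an error page
    return response


# What requests sends when no header is set: "python-requests/<version>",
# which is easy for a server to tell apart from a browser.
print(requests.utils.default_user_agent())
```

Usage would then be `response = fetch(make_session(), url)`, with `url` being the stats URL from the example above.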

  • Boom! You can find your own "User-Agent" value here: https://www.whatismybrowser.com/detect/what-is-my-user-agent