ali*_*rdi · 5 · python, beautifulsoup, web-scraping, python-requests
I want to get a list of proxies from this page: https://free-proxy-list.net/, but I am stuck on the error below and do not know how to fix it.
requests.exceptions.ProxyError: HTTPSConnectionPool(host='free-proxy-list.net', port=443): Max retries exceeded with url: / (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x00000278BFFA1EB0>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond')))
By the way, here is the relevant part of my code:
import urllib
import requests
from bs4 import BeautifulSoup
from fake_useragent import UserAgent

ua = UserAgent(cache=False)
header = {
    "User-Agent": str(ua.msie)   # random IE user agent; note: never actually passed to requests.get below
}
proxy = {
    "https": "http://95.66.151.101:8080"
}

urls = "https://free-proxy-list.net/"
res = requests.get(urls, proxies=proxy)
soup = BeautifulSoup(res.text, 'lxml')
I have tried scraping other websites as well, and I realized the problem is not specific to this site.
小智 · 4
You are using "https" as the key in your proxies dictionary while your proxy is an HTTP proxy.
A proxies dictionary should always follow this format:
For an HTTP proxy:
{"http": "Http Proxy"}
For an HTTPS proxy:
{"https": "Https Proxy"}
For the User-Agent header:
{"User-Agent": "Opera/9.80 (X11; Linux x86_64; U; de) Presto/2.2.15 Version/10.00"}
Example (using the proxy address from the question):
{"http": "http://95.66.151.101:8080"}
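Putting the pieces together, a minimal corrected request could look like the sketch below. It assumes the proxy 95.66.151.101:8080 from the question is still a live HTTP proxy (free proxies go down often, so substitute any working one); http://httpbin.org/ip is used here only as a test endpoint that echoes the requesting IP, which makes it easy to confirm the request actually went through the proxy.

import requests

# An HTTP proxy is keyed under "http".
proxy = {"http": "http://95.66.151.101:8080"}

# A plain, hard-coded User-Agent; fake_useragent is not required.
header = {
    "User-Agent": "Opera/9.80 (X11; Linux x86_64; U; de) Presto/2.2.15 Version/10.00"
}

res = requests.get("http://httpbin.org/ip",
                   proxies=proxy, headers=header, timeout=10)
print(res.json())   # should report the proxy's IP, not your own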
The module you import with from fake_useragent import UserAgent is irrelevant here and unnecessary.
Extra
The error can also occur because the proxy is invalid or simply not responding.
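Because free proxies die frequently, it can help to probe a proxy before relying on it. The helper below, proxy_is_alive, is a hypothetical name used only for illustration; it relies solely on requests' standard proxies and timeout parameters.

import requests

def proxy_is_alive(address, timeout=5.0):
    """Return True if the HTTP proxy at `address` ("ip:port") answers a test request."""
    proxies = {"http": "http://" + address}
    try:
        requests.get("http://httpbin.org/ip", proxies=proxies, timeout=timeout)
        return True
    except requests.exceptions.RequestException:
        # Covers ProxyError, ConnectTimeout, ConnectionError, etc.
        return False

print(proxy_is_alive("95.66.151.101:8080"))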
If you are looking for free proxy lists, consider checking out these sources (a short sketch of how such a plain-text list can be consumed follows the links):
https://pastebin.com/raw/VJwVkqRT
https://proxyscrape.com/free-proxy-list
https://www.freeproxylists.net/
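For instance, the pastebin link above serves plain text. Assuming it contains one ip:port entry per line (the exact format of that paste is not guaranteed), picking a candidate proxy could look like this rough sketch:

import random
import requests

LIST_URL = "https://pastebin.com/raw/VJwVkqRT"   # plain-text list from the links above

resp = requests.get(LIST_URL, timeout=10)
resp.raise_for_status()

# Keep only lines that look like "ip:port".
candidates = [line.strip() for line in resp.text.splitlines() if ":" in line]

if candidates:
    chosen = random.choice(candidates)
    proxy = {"http": "http://" + chosen}
    print("Trying proxy:", proxy)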