Posts by Dav*_* IV

Catching ConnectionResetError in Python

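I am building a Python script that searches a database for all URLs and then follows those URLs to find broken links. The script needs exception handling so that any error raised while opening a link is logged, but it has started hitting one error that I have been completely unable to write an except clause for: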

Traceback (most recent call last):
  File "exceptionerror.py", line 97, in <module>
    raw_response = response.read().decode('utf8', errors='ignore')
  File "/usr/lib/python3.4/http/client.py", line 512, in read
    s = self._safe_read(self.length)
  File "/usr/lib/python3.4/http/client.py", line 662, in _safe_read
    chunk = self.fp.read(min(amt, MAXAMOUNT))
  File "/usr/lib/python3.4/socket.py", line 371, in readinto
    return self._sock.recv_into(b)
ConnectionResetError: [Errno 104] Connection reset by peer

I have tried the following:

except SocketError as inst:
    brokenlinksflag = 1
    brokenlinks = articlelinks[j] + ' ' + sys.exc_info()[0] + ', ' + brokenlinks
    continue

and:

except ConnectionResetError as inst:
    brokenlinksflag = 1 …
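For readers following along, here is a minimal sketch of one way to catch the reset. It assumes the problem is that `response.read()` runs outside the `try` block, or that the `except` clause itself fails; the post is truncated, so neither is confirmed. `ConnectionResetError` is a built-in subclass of `OSError` in Python 3.3+, and resets raised during `read()` are not wrapped in `urllib.error.URLError`. Note also that `sys.exc_info()[0]` is an exception class, so concatenating it to a string raises `TypeError`; the sketch uses `type(inst).__name__` instead. The names `fetch_text` and `opener` are hypothetical and do not come from the post.

```python
import urllib.request


def fetch_text(url, opener=urllib.request.urlopen):
    """Return (text, error_description); exactly one of the two is None.

    `fetch_text` and `opener` are hypothetical names. The `opener`
    parameter exists only so the function can be exercised offline.
    """
    try:
        response = opener(url)
        try:
            # The traceback shows the reset being raised inside read(),
            # so read() has to sit inside the try block as well.
            return response.read().decode('utf8', errors='ignore'), None
        finally:
            response.close()
    except ConnectionResetError as inst:
        # Built-in exception; no import from socket is required.
        return None, '{}: {}'.format(type(inst).__name__, inst)
    except OSError as inst:
        # Covers urllib.error.URLError/HTTPError, timeouts and other
        # socket-level failures, all of which derive from OSError.
        return None, '{}: {}'.format(type(inst).__name__, inst)
```

Inside the link-checking loop, the returned description could be appended to `brokenlinks` whenever the first element is `None`.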

python exception-handling

6 votes, 1 answer, 7262 views

urlopen returns a redirect error for valid links

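I am building a broken-link checker in Python, and building the logic to correctly identify links that do not resolve when visited with a browser is becoming a chore. I have found a set of links for which my scraper consistently reproduces a redirect error, yet which resolve perfectly when visited in a browser. I am hoping to find some insight here.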

import urllib
import urllib.request
import html.parser
import requests
from requests.exceptions import HTTPError
from socket import error as SocketError

# `url` holds the link being checked.
headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux i686; G518Rco3Yp0uLV40Lcc9hAzC1BOROTJADjicLjOmlr4=) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
    'Accept-Encoding': 'gzip, deflate, sdch',
    'Accept-Language': 'en-US,en;q=0.8',
    'Connection': 'keep-alive',
}
output = None  # stays None when the request succeeds

try:
    req = urllib.request.Request(url, None, headers)
    response = urllib.request.urlopen(req)
    raw_response = response.read().decode('utf8', errors='ignore')
    response.close()
except urllib.request.HTTPError as inst:
    output = format(inst)

print(output)

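An example of a URL that reliably returns this error is "http://forums.hostgator.com/want-see-your-sites-dns-propagating-t48838.html". It resolves perfectly when visited in a browser, but the code above returns the following error: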

HTTP Error 301: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x …
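One frequent cause of this message, which the post does not confirm for this URL, is a server that sets a cookie and then redirects back to the same address. The default `urllib.request.urlopen` opener does not store cookies, so its redirect handler sees the same URL repeatedly and gives up. The sketch below, offered under that assumption, builds an opener with `HTTPCookieProcessor` so cookies set during a redirect are sent on the next request. `build_cookie_opener` is a hypothetical helper name.

```python
import http.cookiejar
import urllib.request


def build_cookie_opener():
    """Return (opener, jar): an opener that keeps cookies across redirects.

    Hypothetical helper, assuming the redirect loop is cookie-related.
    """
    jar = http.cookiejar.CookieJar()
    # build_opener() keeps the default handlers (including the redirect
    # handler) and adds cookie storage on top of them.
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar
```

To try it, call `opener.open(req)` in place of `urllib.request.urlopen(req)`. If the error persists, the loop has some other cause.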

urllib httprequest python-3.x

2 votes, 1 answer, 5032 views