Mat*_*ris 0 python sockets scripting freebase urllib2
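I have a data-intensive Python script that uses HTTP connections to download data. I usually run it overnight. Sometimes the connection fails, or a website is temporarily unavailable. I have basic error handling that catches these exceptions and retries periodically, exiting gracefully (and logging the error) after 5 minutes of retrying.
However, I've noticed that sometimes the job just freezes. No error is thrown, and the job is still running, sometimes hours after the last printed message.
What is the best way to: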
UPDATE
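Thanks for your help, everyone. As some of you pointed out, the urllib and socket modules don't have timeouts set properly. I'm using Python 2.5 with the freebase and urllib2 modules, and I catch and handle MetawebErrors and urllib2.URLErrors. Here is a sample of the error output after the last script hung for 12 hours: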
  File "/home/matthew/dev/projects/myapp_module/project/app/myapp/contrib/freebase/api/session.py", line 369, in _httpreq_json
    resp, body = self._httpreq(*args, **kws)
  File "/home/matthew/dev/projects/myapp_module/project/app/myapp/contrib/freebase/api/session.py", line 355, in _httpreq
    return self._http_request(url, method, body, headers)
  File "/home/matthew/dev/projects/myapp_module/project/app/myapp/contrib/freebase/api/httpclients.py", line 33, in __call__
    resp = self.opener.open(req)
  File "/usr/lib/python2.5/urllib2.py", line 381, in open
    response = self._open(req, data)
  File "/usr/lib/python2.5/urllib2.py", line 399, in _open
    '_open', req)
  File "/usr/lib/python2.5/urllib2.py", line 360, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.5/urllib2.py", line 1107, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.5/urllib2.py", line 1080, in do_open
    r = h.getresponse()
  File "/usr/lib/python2.5/httplib.py", line 928, in getresponse
    response.begin()
  File "/usr/lib/python2.5/httplib.py", line 385, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.5/httplib.py", line 343, in _read_status
    line = self.fp.readline()
  File "/usr/lib/python2.5/socket.py", line 372, in readline
    data = recv(1)
KeyboardInterrupt
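You'll notice the socket error at the bottom. Since I'm using Python 2.5 and don't have access to the third urllib2.urlopen option, is there another way to watch for and catch this error? For example, I'm catching URLErrors. Is there another type of error in urllib2 or socket that I could catch that would help me?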
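
For illustration, here is a minimal sketch of one approach that does not need the urlopen timeout argument: set a process-wide default with socket.setdefaulttimeout so that a blocked recv raises socket.timeout instead of waiting indefinitely, then catch that exception alongside the existing handlers. The helper name call_with_timeout, its parameters, and the fetch callable are hypothetical and not part of the script above. Depending on where the timeout fires, urllib2 may wrap it in a URLError rather than raise it directly, so the existing URLError handling would presumably stay in place.

```python
import socket


def call_with_timeout(fetch, timeout_seconds=30.0, attempts=3):
    """Run fetch() under a default socket timeout, retrying on timeouts.

    fetch is a hypothetical zero-argument callable that performs one request.
    """
    previous = socket.getdefaulttimeout()
    # Only sockets created after this call pick up the new default.
    socket.setdefaulttimeout(timeout_seconds)
    try:
        for attempt in range(attempts):
            try:
                return fetch()
            except socket.timeout:
                # A blocked connect/recv gave up instead of hanging.
                if attempt == attempts - 1:
                    raise  # out of retries: let the caller log and exit
    finally:
        # Restore whatever default was in effect before.
        socket.setdefaulttimeout(previous)
```

In the script, fetch would wrap whichever call currently ends in self.opener.open(req). The default applies to every socket the process creates while it is set, which is why the sketch restores the previous value afterwards.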