urllib2 doesn't retrieve the entire HTTP response

got*_*nes 12 python http urllib2

I'm puzzled as to why I can't download the entire contents of some JSON responses from FriendFeed using urllib2.

>>> import urllib2
>>> stream = urllib2.urlopen('http://friendfeed.com/api/room/the-life-scientists/profile?format=json')
>>> stream.headers['content-length']
'168928'
>>> data = stream.read()
>>> len(data)
61058
>>> # We can see here that I did not retrieve the full JSON
... # given that the stream doesn't end with a closing }
... 
>>> data[-40:]
'ce2-003048343a40","name":"Vincent Racani'

How can I retrieve the full response with urllib2?

Jed*_*ith 18

The best way to get all of the data:

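import urllib2

fp = urllib2.urlopen("http://www.example.com/index.cfm")

# Keep calling read() until it returns an empty string, which signals end of stream.
response = ""
while True:
    data = fp.read()
    if not data:  # an empty string is falsy, so this also covers data == ""
        break
    response += data

print response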

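The reason is that .read() isn't guaranteed to return the entire response, given the nature of sockets. I thought this was discussed in the documentation (maybe for urllib), but I can't find it.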
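As a rough sketch of the same idea on a generic file-like object, the loop below collects chunks until read() returns an empty result, then compares the body length with the Content-Length value, as the question does by hand. The names `read_fully`, `matches_content_length`, and `ShortReadStream` are made up for illustration; `ShortReadStream` is only a stand-in that simulates short reads and is not part of urllib2. Whether this loop helps with a particular server is not established here.

```python
import io


def read_fully(stream, chunk_size=8192):
    """Call stream.read() repeatedly until it returns an empty result."""
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes/str means end of stream
            break
        chunks.append(chunk)
    return b"".join(chunks)


def matches_content_length(body, content_length):
    """Compare the body length with a Content-Length header value (a string)."""
    return len(body) == int(content_length)


class ShortReadStream(object):
    """Hypothetical test double: returns at most `limit` bytes per read()."""

    def __init__(self, payload, limit):
        self._buffer = io.BytesIO(payload)
        self._limit = limit

    def read(self, size=-1):
        # Cap every request at `limit` to imitate a partial read.
        if size is None or size < 0 or size > self._limit:
            size = self._limit
        return self._buffer.read(size)


# Illustrative payload, not data from the question.
payload = b'{"status": "ok", "items": [1, 2, 3]}'
body = read_fully(ShortReadStream(payload, limit=5))
print(matches_content_length(body, str(len(payload))))
```

With a real response, the object returned by urllib2.urlopen() could be passed as `stream` and `stream.headers['content-length']` as the second argument to `matches_content_length`. A mismatch would suggest the transfer ended early rather than that the loop stopped too soon.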

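  • I couldn't get this example to work with the URL given in the question, http://friendfeed.com/api/room/the-life-scientists/profile?format=json. The response is still incomplete. As I mentioned to John Weldon, repeated calls to `read()` only return empty strings, and `read()` appears to be exhaustive. (2 upvotes)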