Twitter trends API UnicodeDecodeError: 'utf8' codec can't decode byte 0x8b in position 1: unexpected code byte

Question

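I am trying to follow the sample code from the book "Mining the Social Web", Example 1-3.

I know the book's version is outdated, so I followed the newer example on the web page.

However, when I run the code, I sometimes get an error on this line: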

[ trend.decode('utf-8') for trend in world_trends()[0]['trends'] ]

The error message looks like this:

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "build/bdist.macosx-10.6-universal/egg/twitter/api.py", line 167, in __call__
File "build/bdist.macosx-10.6-universal/egg/twitter/api.py", line 173, in _handle_response
File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x8b in position 1: unexpected code byte

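It doesn't always happen, but I don't think any programmer likes this kind of "random" behavior.

Can someone help me with this? What is the problem, and how can I fix it?

Thanks a lot~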

Answer 1

sma*_*rgh 24

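Byte 0x8b in position 1 usually means the data stream is gzip-compressed: every gzip stream begins with the two magic bytes 0x1f 0x8b, and 0x8b is not a valid UTF-8 start byte. For similar questions, see here and here.

To decompress the data stream: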
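The diagnosis above can be reproduced in a few lines. This is a minimal Python 3 sketch, not the asker's Python 2.6 setup; `payload` is a made-up stand-in for a compressed API response body. On Python 3 the exception text reads "invalid start byte" rather than "unexpected code byte", but the failing position is the same.

```python
import gzip

# Hypothetical payload standing in for a gzip-compressed API response body.
payload = gzip.compress(b'{"trends": []}')

# Every gzip stream starts with the two magic bytes 0x1f 0x8b.
magic = payload[:2]

try:
    payload.decode("utf-8")
    error_position = None
except UnicodeDecodeError as exc:
    # 0x1f is valid ASCII, so decoding fails on the second byte (index 1).
    error_position = exc.start
```

Because the failure depends on whether the body arrives compressed, it would appear only on some requests, which matches the intermittent behavior described in the question.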

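import gzip
import StringIO  # Python 2 module; on Python 3 use io.BytesIO instead
# <response object> is a placeholder for the HTTP response holding the raw body
buf = StringIO.StringIO(<response object>.content)
gzip_f = gzip.GzipFile(fileobj=buf)
content = gzip_f.read()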
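Since the error is intermittent, the body is presumably compressed only some of the time. One way to handle both cases is to check for the gzip magic bytes before decoding. The following is a Python 3 sketch under that assumption; `decode_body` and `GZIP_MAGIC` are hypothetical names used for illustration and are not part of the twitter library.

```python
import gzip
import io

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def decode_body(raw, encoding="utf-8"):
    """Decompress raw bytes if they look like gzip, then decode to text."""
    if raw[:2] == GZIP_MAGIC:
        # Same approach as above: wrap the bytes in a file-like object.
        with gzip.GzipFile(fileobj=io.BytesIO(raw)) as gz:
            raw = gz.read()
    return raw.decode(encoding)
```

For example, `decode_body(gzip.compress(b"trend"))` and `decode_body(b"trend")` both yield the string `"trend"`.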

Answer 2

dav*_*fg4 1

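By default, decode() raises an error when it encounters a byte it does not know how to decode.

You can use trend.decode('utf-8', 'replace') or trend.decode('utf-8', 'ignore') to suppress the error: 'replace' substitutes the replacement character U+FFFD for each undecodable byte, and 'ignore' silently drops it.

See here for the documentation on decode().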
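To illustrate what the two error handlers do, here is a small Python 3 sketch; the byte string `raw` is an invented example containing one byte (0xff) that is never valid in UTF-8.

```python
# Invented sample: 0xff can never appear in valid UTF-8.
raw = b"caf\xff trends"

# 'replace' swaps each undecodable byte for U+FFFD (the replacement character).
replaced = raw.decode("utf-8", "replace")

# 'ignore' drops undecodable bytes without any marker.
ignored = raw.decode("utf-8", "ignore")
```

Neither handler recovers the original text; both only prevent the exception, so they suit input with a few stray bytes rather than input that is not UTF-8 text at all.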
