Decode function tries to encode (Python)

Question

Jim*_*ket 7 python unicode unicode-escapes

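I am trying to print a Unicode string without the encoding-specific hex escapes. I fetch this data from Facebook, whose HTML headers declare the encoding as UTF-8. When I print the type, it says the string is unicode, but when I try to decode it with unicode-escape, I get an encoding error. Why would a decode method try to encode?

Code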

a='really long string of unicode html text that i wont reprint'
print type(a)
 >>> <type 'unicode'>   
print a.decode('unicode-escape')
 >>> Traceback (most recent call last):
  File "scfbp.py", line 203, in myFunctionPage
    print a.decode('unicode-escape')
UnicodeEncodeError: 'ascii' codec can't encode character u'\u20ac' in position 1945: ordinal not in range(128)

Answer 1

Mar*_*ers 8

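The decoding is not what fails. The error occurs because you are trying to display the result on the console: print has to encode the Unicode string for output, and here it falls back to the default ASCII codec, which cannot represent u'\u20ac'. Drop the print and the call works.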

>>> a=u'really long string containing \\u20ac and some other text'
>>> type(a)
<type 'unicode'>
>>> a.decode('unicode-escape')
u'really long string containing \u20ac and some other text'
>>> print a.decode('unicode-escape')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u20ac' in position 30: ordinal not in range(128)

I suggest using IDLE or another interpreter that can output Unicode; then you will not run into this problem.
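If switching consoles is not an option, one possible workaround is to encode the result explicitly before printing, so the output no longer depends on the console's codec. The following is only a sketch under assumptions: the names decoded, as_utf8 and as_ascii are illustrative, the choice of UTF-8 is a guess about what the terminal accepts, and the explicit .encode('ascii') step is written out so the snippet behaves the same on Python 2 and Python 3.

```python
from __future__ import print_function

# Same sample text as in the answer: a literal backslash followed by "u20ac".
a = u'really long string containing \\u20ac and some other text'

# Make the bytes step explicit, then turn the escape into a real character.
decoded = a.encode('ascii').decode('unicode-escape')

# Encode explicitly instead of letting print pick a codec.
as_utf8 = decoded.encode('utf-8')                       # euro sign as UTF-8 bytes
as_ascii = decoded.encode('ascii', 'backslashreplace')  # pure ASCII, escape stays visible

# repr() keeps the output ASCII-only, so this line is safe on any console.
print(repr(as_ascii))
```

The backslashreplace variant is handy for logging, because it never raises; the UTF-8 variant is the one to write to a file or to a terminal known to accept UTF-8.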

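Update: note that this is different from the case with one backslash fewer, which fails during the decode call itself but produces the same error message. There the string already contains the real character u'\u20ac', and in Python 2 calling decode on a unicode object first implicitly encodes it with the default ASCII codec: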

>>> a=u'really long string containing \u20ac and some other text'
>>> type(a)
<type 'unicode'>
>>> a.decode('unicode-escape')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u20ac' in position 30: ordinal not in range(128)
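The hidden step behind that traceback can be spelled out. This is a rough sketch, not Python's actual implementation: decode_like_python2 is a hypothetical helper that mimics what Python 2 does when decode is called on a unicode object, assuming the default codec is ASCII. It runs on Python 2 and Python 3.

```python
from __future__ import print_function

# One backslash fewer: the string now holds the real euro character.
a = u'really long string containing \u20ac and some other text'


def decode_like_python2(text, codec):
    """Hypothetical helper: the two steps behind unicode.decode() in Python 2."""
    raw = text.encode('ascii')   # implicit in Python 2; fails on non-ASCII text
    return raw.decode(codec)     # the decode the caller actually asked for


try:
    decode_like_python2(a, 'unicode-escape')
    failure = None
except UnicodeEncodeError as exc:
    # The error is raised by the encode step, before any decoding happens.
    failure = exc

print(type(failure).__name__, failure.start)
```

The reported position is the index of the euro sign in the input string, which is why the message looks identical to the one produced by print.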
