Iva*_*cha 2 python unicode utf-8
I tried the following and got this error:
>>> import re
>>> x = 'Ingl\xeas'
>>> x
'Ingl\xeas'
>>> print x
Ingl?s
>>> x.decode('utf8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 4-5: unexpected end of data
>>> x.decode('utf8', 'ignore')
u'Ingl'
>>> x.decode('utf8', 'replace')
u'Ingl\ufffd'
>>> print x.decode('utf8', 'replace')
Ingl?
>>> print x.decode('utf8', 'xmlcharrefreplace')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
TypeError: don't know how to handle UnicodeDecodeError in error callback
When I use the print statement, I expect:
>>> print x
u'Inglês'
Any help is welcome.
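Before you decode input data, you need to know which encoding it is in. In some of your attempts you tried to decode it as UTF-8, but Python raised an exception because the input is not valid UTF-8. It looks like it is probably Latin-1. This works for me: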
>>> x = 'Ingl\xeas'
>>> print x.decode('latin1')
Inglês
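When the encoding is not known for certain, one option is to try a short list of candidates in order. The following is a minimal sketch, not a definitive recipe: the helper name `decode_bytes` and the candidate list are hypothetical, and it is written for Python 3, where byte strings are `b'...'` literals (unlike the Python 2 session above).

```python
def decode_bytes(raw, encodings=("utf-8", "latin-1")):
    """Try each candidate encoding in order; return (text, encoding_used).

    Assumption: the candidate list is illustrative. Latin-1 maps every
    byte to a character, so it never fails and must come last.
    """
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            # Not valid in this encoding; try the next candidate.
            continue
    # Only reachable if the candidate list has no catch-all such as Latin-1.
    raise ValueError("none of the candidate encodings matched")


raw = b"Ingl\xeas"
text, used = decode_bytes(raw)
print(text, used)
```

Because Latin-1 accepts any byte sequence, a successful fallback only means the bytes were decodable, not that the guess was right; knowing the real encoding is still preferable.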
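You mention "non-ASCII HTML". If you are writing a web server script and the data comes from an HTTP request, you should check the Content-Type header. In an ideal world, it tells you which encoding the client used for the data. Keep in mind that the client may be misbehaving.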
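As a rough illustration of reading the encoding from a Content-Type header, the sketch below uses the standard library's `email.message.Message` to parse the `charset` parameter. It targets Python 3; the function names `charset_from_content_type` and `decode_body`, and the Latin-1 default, are assumptions made for the example rather than part of any framework.

```python
from email.message import Message


def charset_from_content_type(header_value, default="latin-1"):
    """Return the charset named in a Content-Type value, or `default`.

    Assumption: falling back to Latin-1 is a choice made for this sketch.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the lower-cased charset, or None.
    return msg.get_content_charset() or default


def decode_body(raw, header_value, default="latin-1"):
    """Decode `raw` bytes using the declared charset, with a fallback."""
    charset = charset_from_content_type(header_value, default)
    try:
        return raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown charset name, or the client declared the wrong encoding.
        return raw.decode(default, errors="replace")


body = decode_body(b"Ingl\xeas", "text/html; charset=ISO-8859-1")
print(body)
```

The fallback branch covers the misbehaving-client case: a request that declares `charset=utf-8` but sends Latin-1 bytes is still decoded, just without any guarantee that the result is what the sender intended.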
Hope that helps!