'ascii' codec can't decode byte 0xe9

Question

iqu*_*rio 5 python unicode encoding decode utf-8

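I've done some research and seen several solutions, but none of them have worked for me.

I know that 0xe9 is the é character, but I still can't figure out how to get this working. Here is my code: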

output_lines = ['<menu>', '<day name="monday">', '<meal name="BREAKFAST">', '<counter name="Entreé">', '<dish>', '<name icon1="Vegan" icon2="Mindful Item">', 'Cream of Wheat (Farina)','</name>', '</dish>', '</counter >', '</meal >', '</day >', '</menu >']
output_string = '\n'.join([line.encode("utf-8") for line in output_lines])


This gives me the error: 'ascii' codec can't decode byte 0xe9

I've tried decoding, and I've tried replacing the "é", but I can't seem to get either to work.

Answer 1

Mar*_*ers 5

You are trying to encode byte strings:

>>> '<counter name="Entreé">'.encode('utf8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 20: ordinal not in range(128)


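Python is trying to be helpful here. You can only encode Unicode strings to bytes, so when you call .encode() on a byte string, Python 2 first implicitly decodes it using the default encoding. That default is ASCII, which is why an encode call ends up raising a UnicodeDecodeError that mentions the 'ascii' codec.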
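The hidden decode step can be made visible by performing it by hand. The following is a minimal sketch, written for Python 3 (where bytes have no .encode() method, so the implicit step has to be spelled out), and it assumes the original source file was saved as UTF-8:

```python
# The byte string as Python 2 would hold it, assuming a UTF-8 source file.
# \u00e9 is the e-acute character from the question.
raw = u'<counter name="Entre\u00e9">'.encode('utf-8')

error = None
try:
    # This is the step Python 2 performs silently before encoding a byte string.
    raw.decode('ascii')
except UnicodeDecodeError as exc:
    error = exc  # keep a reference; 'exc' is cleared when the block ends
    print(exc)
```

The failure happens at the first non-ASCII byte, which matches the position reported in the traceback above.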

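The solution is not to encode data that is already encoded. If the data was encoded with a different codec than the one you need, first decode it with the matching codec and only then encode it again.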
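As an illustration of the decode-then-encode route, here is a small sketch in Python 3. It assumes, purely for the sake of the example, that the input bytes are Latin-1; the question does not say which encoding the data actually uses. In Latin-1 the é is the single byte 0xe9, while in UTF-8 it is the two bytes 0xc3 0xa9, which may be why the question and the traceback above report different bytes.

```python
# Hypothetical input: the same text encoded as Latin-1 (an assumption).
latin1_bytes = u'Entre\u00e9'.encode('latin-1')

# Decode with the codec the data was really written in...
text = latin1_bytes.decode('latin-1')

# ...and only then encode to the codec you need.
utf8_bytes = text.encode('utf-8')

print(repr(latin1_bytes))
print(repr(utf8_bytes))
```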

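If you have a mix of unicode and bytestring values, decode just the byte strings or encode just the unicode values; try to avoid mixing the types. The following decodes byte strings to unicode first: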

def ensure_unicode(v):
    # Python 2: str is a byte string, so decode it to unicode first
    if isinstance(v, str):
        v = v.decode('utf8')
    return unicode(v)  # convert anything not a string to unicode too

output_string = u'\n'.join([ensure_unicode(line) for line in output_lines])

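The answer's helper is Python 2 code (unicode and str.decode do not exist in Python 3). For readers on Python 3, a rough equivalent might look like the sketch below. The name ensure_text, its encoding parameter, and the sample list are hypothetical and chosen only for illustration; the sketch assumes any byte strings are UTF-8.

```python
def ensure_text(value, encoding='utf-8'):
    """Return value as text, decoding it first if it is a byte string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return str(value)  # convert anything that is not a string to text too


# Hypothetical mixed input: bytes, text, and a non-string value.
mixed_lines = [b'<counter name="Entre\xc3\xa9">', u'Cream of Wheat (Farina)', 5]

# Join as text, then encode exactly once at the output boundary.
output_string = u'\n'.join(ensure_text(line) for line in mixed_lines)
output_bytes = output_string.encode('utf-8')
```

Keeping everything as text internally and encoding once at the edge avoids the mixed-type situation the answer warns about.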
