How do I decode this string in Python?

Question


I downloaded a Facebook messages dataset, and the text in it is formatted like this:

f\u00c3\u00b8rste student


It should read "første student", but I can't seem to decode it correctly.

I tried:

s = 'f\u00c3\u00b8rste student'
print(s)
# fÃ¸rste student

s = 'f\u00c3\u00b8rste student'
print(s.encode('utf-8'))
# b'f\xc3\x83\xc2\xb8rste student'


But neither attempt worked.

Answer 1

jwo*_*der 8

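To undo whatever encoding error occurred, first convert the characters into bytes with the same ordinal values by encoding the string as ISO-8859-1 (Latin-1), then decode those bytes as UTF-8: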

>>> 'f\u00c3\u00b8rste student'.encode('iso-8859-1').decode('utf-8')
'første student'
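
If the whole dataset is affected, the same round trip can be wrapped in a small helper and applied to every string. The following is only a sketch: fix_mojibake and fix_all are hypothetical names, the sample JSON and its "content" field are assumptions about the export format, and the fallback simply leaves a string untouched when the round trip fails.

```python
import json


def fix_mojibake(text: str) -> str:
    """Undo UTF-8 bytes that were mis-decoded as Latin-1 (hypothetical helper)."""
    try:
        return text.encode('iso-8859-1').decode('utf-8')
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The string is not Latin-1 mojibake (or is already correct): keep it as-is.
        return text


def fix_all(value):
    """Recursively repair every string in data loaded from JSON."""
    if isinstance(value, str):
        return fix_mojibake(value)
    if isinstance(value, list):
        return [fix_all(item) for item in value]
    if isinstance(value, dict):
        return {fix_all(key): fix_all(item) for key, item in value.items()}
    return value


# Assumed sample record; a real export's structure and field names may differ.
raw = '{"content": "f\\u00c3\\u00b8rste student"}'
data = fix_all(json.loads(raw))
print(data["content"])
```

One caveat: a string that was correct to begin with but happens to survive the round trip (for example, text that really is meant to contain "Ã¸") would be changed too, so it is worth spot-checking the results.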

