How to convert unicode to unicode-escaped text

Question


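I'm loading a file that contains a bunch of unicode characters (e.g. \xe9\x87\x8b). I want to convert these characters to their escaped unicode form (\u91cb) in Python. I've found a couple of similar questions on StackOverflow, including "Evaluate UTF-8 Literal escape strings in a string in Python3", which does almost exactly what I want, but I can't work out how to save the data.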

For example, the input file:

\xe9\x87\x8b

Python script:

file = open("input.txt", "r")
text = file.read()
file.close()
encoded = text.encode().decode('unicode-escape').encode('latin1').decode('utf-8')
file = open("output.txt", "w")
file.write(encoded) # fails with a unicode exception
file.close()


Output file (what I want):

\u91cb

Answer 1

fal*_*tru 5

You need to encode it again with the unicode-escape encoding.


>>> br'\xe9\x87\x8b'.decode('unicode-escape').encode('latin1').decode('utf-8')
'釋'
>>> _.encode('unicode-escape')
b'\\u91cb'
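The chain above can be unpacked one step at a time. The following is only an illustrative sketch: the variable names (`raw`, `step1` to `step4`) are made up here, and the input is assumed to hold literal backslash-x escapes of UTF-8 bytes, as in the question.

```python
# Literal backslash-x escapes, as read from the file: 12 ASCII bytes.
raw = rb"\xe9\x87\x8b"

# Step 1: interpret the escapes. Each \xNN becomes the code point U+00NN,
# so the result is a 3-character str, not yet the character we want.
step1 = raw.decode("unicode-escape")

# Step 2: latin1 maps U+0000..U+00FF one-to-one onto bytes 0x00..0xFF,
# which recovers the 3 real bytes that the escapes described.
step2 = step1.encode("latin1")

# Step 3: those 3 bytes are the UTF-8 encoding of a single character.
step3 = step2.decode("utf-8")

# Step 4: encode once more with unicode-escape to get the \uXXXX form.
step4 = step3.encode("unicode-escape")

print(len(raw), len(step1), len(step2), len(step3))  # 12 3 3 1
print(step4)                                         # b'\\u91cb'
```

Step 4 returns `bytes`, which is why the output file has to be opened in binary mode (or the bytes decoded as ASCII before writing in text mode).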


Modified code (using binary mode to avoid unnecessary encoding/decoding):


with open("input.txt", "rb") as f:
    text = f.read().rstrip()  # rstrip to remove trailing spaces
decoded = text.decode('unicode-escape').encode('latin1').decode('utf-8')
with open("output.txt", "wb") as f:
    f.write(decoded.encode('unicode-escape'))


http://asciinema.org/a/797ruy4u5gd1vsv8pplzlb6kq
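For reuse, the same chain could be wrapped in a small function and checked against temporary files. This is a hedged sketch rather than part of the original answer: `to_unicode_escape` is a hypothetical helper name, and the temporary directory stands in for input.txt and output.txt.

```python
import os
import tempfile


def to_unicode_escape(data: bytes) -> bytes:
    """Convert backslash-x escaped UTF-8 bytes to backslash-u escapes.

    Hypothetical helper; it wraps the same chain used in the answer.
    """
    text = data.decode("unicode-escape").encode("latin1").decode("utf-8")
    return text.encode("unicode-escape")


# Round trip through temporary files (stand-ins for input.txt/output.txt).
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "input.txt")
    dst = os.path.join(tmp, "output.txt")

    # Write the literal escape text plus a trailing newline.
    with open(src, "wb") as f:
        f.write(rb"\xe9\x87\x8b" + b"\n")

    # Binary mode in, strip trailing whitespace, convert.
    with open(src, "rb") as f:
        converted = to_unicode_escape(f.read().rstrip())

    # Binary mode out: the converted data is already ASCII bytes.
    with open(dst, "wb") as f:
        f.write(converted)

    with open(dst, "rb") as f:
        result = f.read()

print(result)  # b'\\u91cb'
```

Two caveats under these assumptions: without the `rstrip()`, the trailing newline would itself be escaped and written as a literal `\n`, and the `latin1` step raises `UnicodeEncodeError` if the input already contains escapes for code points above U+00FF.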

