How to encode a Python 3 string using \u escape codes?

Question

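In Python 3, suppose I have

>>> thai_string = 'สีเ'

Using encode gives

>>> thai_string.encode('utf-8')
b'\xe0\xb8\xaa\xe0\xb8\xb5\xe0\xb9\x80'

My question: how can I get encode() to return a bytes sequence that uses \u instead of \x? And how can I decode it back to a Python 3 str?

I tried the ascii builtin, which gives

>>> ascii(thai_string)
"'\\u0e2a\\u0e35\\u0e40'"

But this doesn't seem quite right, because I can't decode it to get thai_string back.

The documentation says that \u is only used in string literals, but I'm not sure what that means. Does it imply that my question has a flawed premise?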

Answer 1

You can use unicode_escape:

>>> thai_string.encode('unicode_escape')
b'\\u0e2a\\u0e35\\u0e40'

Note that encode() always returns a byte string (bytes). The unicode_escape encoding is intended to:

Produce a string that is suitable as Unicode literal in Python source code
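
The question also asks how to get back to a str. A minimal round-trip sketch, assuming the string consists of the three code points U+0E2A, U+0E35, U+0E40 that appear in the escaped output above (the variable names escaped, restored, and escaped_text are only illustrative):

```python
# Build the string from its code points so the example does not depend
# on how this page's own encoding renders the Thai characters.
thai_string = '\u0e2a\u0e35\u0e40'

# str -> bytes: each non-ASCII character becomes a literal backslash,
# the letter u, and four hex digits.
escaped = thai_string.encode('unicode_escape')

# bytes -> str: the same codec interprets the escapes again.
restored = escaped.decode('unicode_escape')

# The escaped bytes are pure ASCII, so they can also be held as a str
# of visible escape sequences if bytes are not wanted.
escaped_text = escaped.decode('ascii')

print(escaped)
print(escaped_text)
print(restored == thai_string)
```

This also shows what "only used in string literals" means: \u0e2a is notation the parser understands when it reads source code, while the str object itself just holds the characters. Note that decoding with unicode_escape treats any unescaped bytes as Latin-1, so it is best reserved for data that was produced by the matching encode call.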