How do I encode a text string as a number in Python?

les*_*aul 4 python python-3.x

Suppose you have a string:

mystring = "Welcome to the InterStar cafe, serving you since 2412!"

I'm looking for a way to convert that string into a number, for example:

encoded_string = number_encode(mystring)

print(encoded_string)

08713091353153848093820430298

...which you can then convert back to the original string:

decoded_string = number_decode(encoded_string)

print(decoded_string)

"Welcome to the InterStar cafe, serving you since 2412!"

It doesn't have to be cryptographically secure, but it must output the same number for the same string no matter which computer it runs on.

Sha*_*ger 7

encode the str to bytes in a fixed encoding, then convert the bytes to an int with int.from_bytes. The reverse operation is to call .to_bytes on the resulting int, then decode back to str:

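mystring = "Welcome to the InterStar cafe, serving you since 2412!"
mybytes = mystring.encode('utf-8')  # Fixed encoding, so the result is the same on every machine
myint = int.from_bytes(mybytes, 'little')  # Interpret the bytes as one large integer
print(myint)
# Use the minimum number of bytes that can hold the integer
recoveredbytes = myint.to_bytes((myint.bit_length() + 7) // 8, 'little')
recoveredstring = recoveredbytes.decode('utf-8')
print(recoveredstring)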
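
The same steps can be wrapped into the two functions the question asks for. This is only a minimal sketch: the names number_encode and number_decode are taken from the question, the choice of UTF-8 and 'little' follows the snippet above, and the type hints and docstrings are additions made here for illustration.

```python
def number_encode(text: str) -> int:
    """Encode text as a non-negative integer via its UTF-8 bytes."""
    return int.from_bytes(text.encode('utf-8'), 'little')


def number_decode(number: int) -> str:
    """Invert number_encode (trailing NUL characters are not preserved)."""
    # Minimum number of bytes needed to hold the integer
    length = (number.bit_length() + 7) // 8
    return number.to_bytes(length, 'little').decode('utf-8')


mystring = "Welcome to the InterStar cafe, serving you since 2412!"
assert number_decode(number_encode(mystring)) == mystring
```

Because both the encoding and the byte order are fixed explicitly, the resulting integer does not depend on the platform's default encoding or native endianness.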


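This has one flaw: if the string ends with NUL characters ('\0' / '\x00'), you'll lose them (switching to 'big' byte order would lose them from the front instead). If that's a problem, you can always pad with an explicit '\x01' and strip it on the decoding side, so that no trailing zeroes are lost: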

mystring = "Welcome to the InterStar cafe, serving you since 2412!"
mybytes = mystring.encode('utf-8') + b'\x01'  # Pad with 1 to preserve trailing zeroes
myint = int.from_bytes(mybytes, 'little')
print(myint)
recoveredbytes = myint.to_bytes((myint.bit_length() + 7) // 8, 'little')
recoveredstring = recoveredbytes[:-1].decode('utf-8') # Strip pad before decoding
print(recoveredstring)
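
To make the difference concrete, here is a small side-by-side sketch of the unpadded and padded round trips on a string that ends in NUL characters. The helper names (encode_plain, decode_plain, encode_padded, decode_padded) are hypothetical and exist only for this illustration.

```python
def encode_plain(text: str) -> int:
    return int.from_bytes(text.encode('utf-8'), 'little')


def decode_plain(number: int) -> str:
    length = (number.bit_length() + 7) // 8
    return number.to_bytes(length, 'little').decode('utf-8')


def encode_padded(text: str) -> int:
    # Append a non-zero sentinel byte so trailing zero bytes still count
    return int.from_bytes(text.encode('utf-8') + b'\x01', 'little')


def decode_padded(number: int) -> str:
    length = (number.bit_length() + 7) // 8
    raw = number.to_bytes(length, 'little')
    return raw[:-1].decode('utf-8')  # Drop the sentinel byte before decoding


tricky = "abc\x00\x00"
assert decode_plain(encode_plain(tricky)) == "abc"     # Trailing NULs are lost
assert decode_padded(encode_padded(tricky)) == tricky  # Trailing NULs survive
```

The sentinel works because the most significant byte of the integer is then always non-zero, so bit_length() accounts for every byte of the original string.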