How do I left-justify a UTF-8 encoded string in Python?

Question

Sum*_*Tea 1 python format encoding utf-8

I'm trying to left-justify a UTF-8 encoded string with string.ljust. It raises this exception: UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 0: ordinal not in range(128). For example,

from sys import stdout
s = u"??"    # a Chinese string
stdout.write(s.encode("UTF-8").ljust(20))

Am I on the right track? Or should I use some other approach for formatting?

Thanks and best regards.

Answer 1

Mar*_*nen 5

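Did you post the exact code and the exact error you received? Your code works without throwing an error on both a cp437 and a utf-8 terminal. In any case, you should justify the Unicode string before encoding it and sending it to the terminal. Note the difference: the Chinese text has length 6 once encoded as UTF-8, rather than length 2: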

>>> sys.stdout.write(s.encode('utf-8').ljust(20) + "hello")
??              hello
>>> sys.stdout.write(s.ljust(20).encode('utf-8') + "hello")
??                  hello

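The length difference is easy to check directly. The following is a minimal sketch, assuming Python 3 and using the two characters `\u4e2d\u6587` as a hypothetical stand-in for the Chinese string `s` (the original characters are not known):

```python
# Stand-in for the Chinese string in the question (an assumption, not the original text).
s = u"\u4e2d\u6587"

encoded = s.encode("utf-8")
print(len(s))        # 2 characters
print(len(encoded))  # 6 bytes, 3 per character

# Padding after encoding counts bytes, so only 14 spaces are added.
encode_then_pad = encoded.ljust(20)
# Padding before encoding counts characters, so 18 spaces are added.
pad_then_encode = s.ljust(20).encode("utf-8")

print(len(encode_then_pad.decode("utf-8")))  # 16 characters
print(len(pad_then_encode.decode("utf-8")))  # 20 characters
```

Padding the text first and encoding last keeps the column count independent of how many bytes each character needs.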

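Also note that Chinese characters are wider than other characters in typical fixed-width fonts, so if you mix languages, things may still not line up the way you want (see this answer for a solution):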

>>> sys.stdout.write("12".ljust(20) + "hello")
12                  hello

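One way to account for the extra width is to count terminal columns instead of characters. The sketch below assumes Python 3 and uses the standard library's `unicodedata.east_asian_width`; the helper names `display_width` and `ljust_display` are hypothetical, and this is not necessarily the approach taken in the answer referenced above. It treats wide (`W`) and fullwidth (`F`) characters as two columns and everything else as one, which ignores combining characters and terminal-specific quirks:

```python
import unicodedata


def display_width(text):
    """Approximate terminal columns: wide and fullwidth characters count as 2."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


def ljust_display(text, width, fill=u" "):
    """Left-justify text to `width` columns rather than `width` characters."""
    return text + fill * max(0, width - display_width(text))


chinese = u"\u4e2d\u6587"  # hypothetical stand-in for the Chinese string
row1 = ljust_display(chinese, 20) + u"hello"
row2 = ljust_display(u"12", 20) + u"hello"

# Both rows place "hello" at column 20, although row1 has fewer characters.
print(display_width(row1), len(row1))  # 25 23
print(display_width(row2), len(row2))  # 25 25
```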

Usually you can skip the explicit encoding for stdout. When writing to a terminal, Python implicitly encodes Unicode strings in the terminal's encoding (see sys.stdout.encoding):

sys.stdout.write(s.ljust(20))

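To see the implicit encoding at work without depending on a particular terminal, the sketch below (assuming Python 3, where `sys.stdout` is normally an `io.TextIOWrapper`) writes to an in-memory text stream with an explicit encoding. The stand-in string and the variable names are assumptions for illustration:

```python
import io

s = u"\u4e2d\u6587"  # hypothetical stand-in for the Chinese string

# A text stream that encodes to UTF-8 on write, much as sys.stdout does
# with the terminal's encoding.
raw_buffer = io.BytesIO()
stream = io.TextIOWrapper(raw_buffer, encoding="utf-8", newline="\n")

stream.write(s.ljust(20) + u"hello\n")  # pass text, not bytes
stream.flush()

written = raw_buffer.getvalue()
print(len(written))  # 30 bytes: 6 for the two characters, 18 spaces, "hello", newline
```

The caller only ever handles text; the stream decides how many bytes each character becomes.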

Another option is to use print:

print "%-20s" % s   # old-style; the "-" flag left-justifies

Or:

print u'{:20}'.format(s)  # new-style; the u prefix keeps the result Unicode

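For readers on Python 3, where `print` is a function and every `str` is Unicode, the same padding can be sketched as follows. The stand-in string is an assumption; note that `%20s` right-justifies unless the `-` flag is given, while `{:20}` left-justifies strings by default:

```python
s = u"\u4e2d\u6587"  # hypothetical stand-in for the Chinese string

old_style = "%-20s" % s          # "-" requests left justification
new_style = "{:<20}".format(s)   # "<" is the default for strings, spelled out here
f_string = f"{s:<20}"            # f-strings use the same format spec
right_aligned = "%20s" % s       # without "-", the padding goes on the left

print(old_style == new_style == f_string == s.ljust(20))  # True
print(right_aligned == s.rjust(20))                        # True
```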
