kev*_*kev 7 python unicode string-formatting
I have three UTF-8 strings:
hello, world
hello, 世界
hello, 世rld
I only want the first 10 ASCII character widths of each string, so that the brackets line up in one column:
[hello, wor]
[hello, 世 ]
[hello, 世r]
In the console:
width('世界')==width('worl')
width('世 ')==width('wor') # a white space after '世'
A Chinese character takes three bytes in UTF-8, but it is only 2 ASCII characters wide when displayed in the console:
>>> bytes("hello, 世界", encoding='utf-8')
b'hello, \xe4\xb8\x96\xe7\x95\x8c'
Python's format() does not help when wide UTF-8 characters are mixed in:
>>> for s in ['[{0:<{1}.{1}}]'.format(s, 10) for s in ['hello, world', 'hello, 世界', 'hello, 世rld']]:
...     print(s)
...
[hello, wor]
[hello, 世界 ]
[hello, 世rl]
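A small check of why this happens, as a sketch: format() pads and truncates by code points, which match neither the UTF-8 byte count nor the number of console columns. The variable names below are only illustrative.

```python
s = "hello, 世界"

code_points = len(s)                    # what format() counts
utf8_bytes = len(s.encode("utf-8"))     # three bytes per Chinese character here
padded = "{0:<{1}.{1}}".format(s, 10)   # padded to 10 code points, not 10 columns

print(code_points, utf8_bytes, len(padded))
```

Since each of the two Chinese characters occupies two console columns, the padded string is 10 code points long but is drawn wider than 10 columns, which is what pushes the closing bracket out of line.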
It isn't pretty:
-----------Songs-----------
| 1: ?? |
| 2: ??? |
| 3: ?????? |
| 4: ????? |
| 5: ???(CUCURRUCUCU PALO|
| 6: ???? |
| 7: ?? |
| 8: ???? |
| 9: ????? |
| 10: ??( ?????????)(INTO |
| X 11: ???? |
| X 12: ????(THE MO RUN AIR |
| X 13: ???? |
| X 14: ?? |
| X 15: ??????(SERENADE) |
| X 16: ??????(Sweet Lullaby|
---------------------------
So, I'd like to know whether there is a standard way to do this kind of UTF-8 padding.
Mar*_*nen 13
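When trying to line up ASCII text with Chinese in a fixed-width font, you can use the full-width versions of the printable ASCII characters. Below I build a translation table from ASCII to the full-width versions: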
# coding: utf8
# full width versions (SPACE is non-contiguous with ! through ~)
SPACE = '\N{IDEOGRAPHIC SPACE}'
EXCLA = '\N{FULLWIDTH EXCLAMATION MARK}'
TILDE = '\N{FULLWIDTH TILDE}'
# strings of ASCII and full-width characters (same order)
# the +1 makes both ranges inclusive, so '~' is translated as well
west = ''.join(chr(i) for i in range(ord(' '),ord('~')+1))
east = SPACE + ''.join(chr(i) for i in range(ord(EXCLA),ord(TILDE)+1))
# build the translation table
full = str.maketrans(west,east)
data = '''\
??(A song)
???(Another song)
??????(Yet another song)
?????
???(Cucurrucucu palo whatever)
????
??
????
?????
?????????????(Into something)
????
????
????
??
??????(SERENADE)
??????(Sweet Lullaby)
'''
# Replace the ASCII characters with full width, and create a song list.
data = data.translate(full).rstrip().split('\n')
# translate each printable line.
print(' ----------Songs-----------'.translate(full))
for i,song in enumerate(data):
    line = '|{:4}: {:20.20}|'.format(i+1,song)
    print(line.translate(full))
print(' --------------------------'.translate(full))
???????????????????????????
????????????????????????????
????????????????????????????
????????????????????????????
????????????????????????????
????????????????????????????
????????????????????????????
????????????????????????????
????????????????????????????
????????????????????????????
????????????????????????????
????????????????????????????
????????????????????????????
????????????????????????????
????????????????????????????
????????????????????????????
????????????????????????????
???????????????????????????
It isn't too pretty, but it lines up.
There doesn't seem to be official support for this, but a built-in module may help:
>>> import unicodedata
>>> print(unicodedata.east_asian_width(u'?'))
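One way to build on east_asian_width is a helper that truncates and pads by display columns instead of code points. This is only a sketch: `char_width` and `fit` are hypothetical names, and it assumes that Wide ("W") and Fullwidth ("F") characters take two columns and everything else takes one, which ignores combining marks and ambiguous-width characters.

```python
import unicodedata

def char_width(ch):
    # Assumption: "W" (Wide) and "F" (Fullwidth) occupy two columns, all else one.
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1

def fit(text, width, fill=" "):
    """Truncate text to at most `width` columns, then pad to exactly `width`."""
    out = []
    used = 0
    for ch in text:
        w = char_width(ch)
        if used + w > width:
            break  # the next character would overflow the column budget
        out.append(ch)
        used += w
    return "".join(out) + fill * (width - used)

for s in ["hello, world", "hello, 世界", "hello, 世rld"]:
    print("[{}]".format(fit(s, 10)))
```

When a wide character would straddle the limit, it is dropped and a single fill character takes its place, which is the behaviour the question asks for with `[hello, 世 ]`.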
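This answer to a similar question offers a quick solution. Note, however, that the displayed result depends on the exact monospaced font in use. The default fonts used by ipython and pydev do not work well, while the Windows console does.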