Using textwrap.wrap with a byte count

Question


Val*_*ntz 5 python split word-wrap python-3.x python-unicode

How can I use the textwrap module to split text before a line reaches a given number of bytes (without splitting a multi-byte character)?

I would like something like this:

>>> textwrap.wrap('? ?? ?? ? ? ?? ??', bytewidth=10)
? ??
?? ?
? ??
??
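
For context, the `width` argument of `textwrap.wrap` counts characters, not bytes, so a line that respects the width can still exceed a byte budget. A small illustration, assuming UTF-8 and a made-up sample string (the names `text` and `lines` are only for this sketch):

```python
import textwrap

# Made-up sample text; each CJK character here takes 3 bytes in UTF-8.
text = '日 本語 中文 字 符 测试 文本'

lines = textwrap.wrap(text, width=10)
for line in lines:
    # len(line) counts characters; len(line.encode('utf-8')) counts bytes.
    print(repr(line), len(line), len(line.encode('utf-8')))
```

Every line stays within 10 characters, yet the encoded lines are longer than 10 bytes, which is why a byte-aware wrapper is needed.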


Answer 1

Val*_*ntz 1

I ended up rewriting part of textwrap so that the words are encoded after the string has been split.

Unlike Tom's solution, the Python code does not need to iterate over every character.

import sys
import textwrap


def byteTextWrap(text, size, break_long_words=True):
    """Similar to textwrap.wrap(), but considers the size of strings (in bytes)
    instead of their length (in characters)."""
    try:
        words = textwrap.TextWrapper()._split_chunks(text)
    except AttributeError:  # Python 2
        words = textwrap.TextWrapper()._split(text)
    words.reverse()  # use it as a stack
    if sys.version_info[0] >= 3:
        words = [w.encode() for w in words]  # UTF-8 bytes
    lines = [b'']
    while words:
        word = words.pop(-1)
        if break_long_words and len(word) > size:
            # Cut at `size` bytes, then back up past any UTF-8 continuation
            # bytes (0b10xxxxxx) so a multi-byte character is never split.
            cut = size
            while cut > 0 and (bytearray(word)[cut] & 0xC0) == 0x80:
                cut -= 1
            if cut > 0:  # cut == 0 means one character is wider than `size`
                words.append(word[cut:])
                word = word[:cut]
        if len(lines[-1]) + len(word) <= size:
            lines[-1] += word
        else:
            lines.append(word)
    if sys.version_info[0] >= 3:
        return [l.decode() for l in lines]
    else:
        return lines
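
One thing to watch when slicing encoded text: a cut at an arbitrary byte offset can land inside a multi-byte character, and the resulting pieces will then fail to decode. Below is a minimal sketch of a boundary-safe cut, assuming Python 3 and UTF-8 input; `utf8_safe_cut` is a hypothetical helper name, not part of textwrap.

```python
def utf8_safe_cut(data, size):
    """Split UTF-8 `data` (bytes) into (head, tail) with len(head) <= size,
    never cutting inside a multi-byte character. Hypothetical helper."""
    if len(data) <= size:
        return data, b''
    cut = size
    # Continuation bytes look like 0b10xxxxxx; step back to a lead byte.
    while cut > 0 and (data[cut] & 0xC0) == 0x80:
        cut -= 1
    return data[:cut], data[cut:]


raw = '日本語'.encode('utf-8')  # 9 bytes, 3 per character
head, tail = utf8_safe_cut(raw, 4)
print(head.decode('utf-8'), tail.decode('utf-8'))
```

The loop steps back at most three bytes, so the cost does not depend on the length of the text. If `size` is smaller than the first character, `head` comes back empty and the caller has to decide how to handle that.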

