Python TypeError: expected a character buffer object, a personal misunderstanding

use*_*802 4 python unicode

I was stuck on this error for a long time:

 TypeError: expected a character buffer object

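I have just understood what I got wrong: it is the difference between a unicode string and a 'simple' string. I was trying to use the code below with a "normal" string, whereas I had to pass a unicode one. So forgetting the simple "u" in front of the string was what broke execution :/ !!!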

BTW, the TypeError message was very unclear to me, and it still is.

Please, can someone explain what I was missing, and why a 'simple' string is not a "character buffer object"?

You can reproduce it with the code below (extracted from, and (c), here:)

def maketransU(s1, s2, todel=u""):
    """Build translation table for use with unicode.translate().

    :param s1: string of characters to replace.
    :type s1: unicode
    :param s2: string of replacement characters (same order as in s1).
    :type s2: unicode
    :param todel: string of characters to remove.
    :type todel: unicode
    :return: translation table with character code -> character code.
    :rtype: dict
    """
    # We go unicode internally - ensure callers are ok with that.
    assert (isinstance(s1,unicode))
    assert (isinstance(s2,unicode))
    trans_tab = dict( zip( map(ord, s1), map(ord, s2) ) )
    trans_tab.update( (ord(c),None) for c in todel )
    return trans_tab

#BlankToSpace_table = string.maketrans (u"\r\n\t\v\f",u"     ")
BlankToSpace_table = maketransU (u"\r\n\t\v\f",u"     ")
def BlankToSpace(text):
    """Replace blank characters with real spaces.

    May be useful to prepare text for whitespace-based regular expressions & Co.

    :param  text: the text to clean from blanks.
    :type  text: string
    :return: the text with blank characters replaced by spaces.
    :rtype: string
    """
    print text, type(text), len(text)
    try:
        out =  text.translate(BlankToSpace_table)
    except TypeError, e:
        raise
    return out

# for SO : the code below is just to reproduce what i did not understand
dummy = "Hello,\n, this is a \t dummy test!"
for s in (unicode(dummy), dummy):
    print repr(s)
    print repr(BlankToSpace(s))

This produces:

u'Hello,\n, this is a \t dummy test!'
Hello,
, this is a      dummy test! <type 'unicode'> 32
u'Hello, , this is a   dummy test!'
'Hello,\n, this is a \t dummy test!'
Hello,
, this is a      dummy test! <type 'str'> 32

Traceback (most recent call last):
  File "C:/treetaggerwrapper.error.py", line 44, in <module>
    print repr(BlankToSpace(s))
  File "C:/treetaggerwrapper.error.py", line 36, in BlankToSpace
    out =  text.translate(BlankToSpace_table)
TypeError: expected a character buffer object

Dan*_*man 12

The problem is that the translate method of a bytestring is different from the translate method of a unicode string. Here is the docstring for the non-unicode version:

S.translate(table [,deletechars]) -> string

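Return a copy of the string S, where all characters occurring in the optional argument deletechars are removed, and the remaining characters have been mapped through the given translation table, which must be a string of length 256.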

And here is the unicode version:

S.translate(table) -> unicode

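Return a copy of the string S, where all characters have been mapped through the given translation table, which must be a mapping of Unicode ordinals to Unicode ordinals, Unicode strings or None. Unmapped characters are left untouched. Characters mapped to None are deleted.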

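You can see that the non-unicode version expects "a string of length 256", whereas the unicode version expects a "mapping" (i.e. a dict). So the problem is not that your unicode string is a buffer object and the non-unicode one isn't (of course, both are buffers), but that one translate method expects such a buffer object as its table and the other doesn't. In your code, BlankToSpace_table is a dict, so passing it to the translate method of a plain str raises the TypeError.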
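One possible way to handle both string types is to keep one table per type and choose between them at call time. The following is only a minimal sketch, not the original author's code: the names `make_byte_table`, `byte_table`, `text_table` and `blank_to_space` are hypothetical, and the `getattr` fallback is an assumption made so the same snippet can run on Python 2 (`string.maketrans`) as well as Python 3 (`bytes.maketrans`).

```python
import string

# Assumption: pick whichever 256-entry table builder this interpreter offers.
# Python 2 has string.maketrans; Python 3 has bytes.maketrans instead.
make_byte_table = getattr(string, "maketrans", None) or bytes.maketrans

# Table for byte strings: a 256-character string, as the str docstring requires.
byte_table = make_byte_table(b"\r\n\t\v\f", b"     ")

# Table for unicode strings: a mapping of ordinals to ordinals.
text_table = dict((ord(c), ord(u" ")) for c in u"\r\n\t\v\f")


def blank_to_space(text):
    """Replace blank characters with spaces, using the table that
    matches the type of ``text`` (hypothetical helper)."""
    if isinstance(text, bytes):  # on Python 2, bytes is the plain str type
        return text.translate(byte_table)
    return text.translate(text_table)


# Passing the dict to the byte-string method is what triggers the TypeError.
try:
    b"a\tb".translate(text_table)
    mismatch_raised = False
except TypeError:
    mismatch_raised = True
```

With these tables, `blank_to_space` accepts either kind of string, while `mismatch_raised` records that handing the mapping to a byte string fails in the way described in the question.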