What is the difference between unicode(self) and self.__unicode__() in a Python class?

spr*_*der 3 python unicode class

While working through a unicode problem, I found that unicode(self) and self.__unicode__() behave differently:

#-*- coding:utf-8 -*-
import sys
import dis
class test():
    def __unicode__(self):
        s = u'??'
        return s.encode('utf-8')

    def __str__(self):
        return self.__unicode__()
print dis.dis(test)
a = test()
print a

The code above works fine, but if I change self.__unicode__() to unicode(self), it raises an error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)

The problematic code is:

#-*- coding:utf-8 -*-
import sys
import dis
class test():
    def __unicode__(self):
        s = u'??'
        return s.encode('utf-8')

    def __str__(self):
        return unicode(self)
print dis.dis(test)
a = test()
print a

I'm very curious how Python handles this. I tried the dis module, but didn't see much of a difference:

Disassembly of __str__:
 12           0 LOAD_FAST                0 (self)
              3 LOAD_ATTR                0 (__unicode__)
              6 CALL_FUNCTION            0
              9 RETURN_VALUE   

VS

Disassembly of __str__:
 10           0 LOAD_GLOBAL              0 (unicode)
              3 LOAD_FAST                0 (self)
              6 CALL_FUNCTION            1
              9 RETURN_VALUE       

dav*_*v1d 5

You are returning bytes from your __unicode__ method.

To make it clear:

In [18]: class Test(object):
    def __unicode__(self):
        return u'äö↓'.encode('utf-8')
    def __str__(self):
        return unicode(self)
   ....:     

In [19]: class Test2(object):
    def __unicode__(self):
        return u'äö↓'
    def __str__(self):
        return unicode(self)
   ....:     

In [20]: t = Test()

In [21]: t.__str__()
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
/home/dav1d/<ipython-input-21-e2650f29e6ea> in <module>()
----> 1 t.__str__()

/home/dav1d/<ipython-input-18-8bc639cbc442> in __str__(self)
      3         return u'äö↓'.encode('utf-8')
      4     def __str__(self):
----> 5         return unicode(self)
      6 

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)

In [22]: unicode(t)
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
/home/dav1d/<ipython-input-22-716c041af66e> in <module>()
----> 1 unicode(t)

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)

In [23]: t2 = Test2()

In [24]: t2.__str__()
Out[24]: u'\xe4\xf6\u2193'

In [25]: str(_) # _ = last result
---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)
/home/dav1d/<ipython-input-25-3a1a0b74e31d> in <module>()
----> 1 str(_) # _ = last result

UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)

In [26]: unicode(t2)
Out[26]: u'\xe4\xf6\u2193'

In [27]: class Test3(object):
    def __unicode__(self):
        return u'äö↓'
    def __str__(self):
        return unicode(self).encode('utf-8')
   ....:     

In [28]: t3 = Test3()

In [29]: t3.__unicode__()
Out[29]: u'\xe4\xf6\u2193'

In [30]: t3.__str__()
Out[30]: '\xc3\xa4\xc3\xb6\xe2\x86\x93'

In [31]: print t3
äö↓

In [32]: print unicode(t3)
äö↓

print a, or in my case print t, calls t.__str__, which is expected to return bytes. You let it return unicode, so Python tries to encode it with ascii, which does not work.
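Both tracebacks in the session come from an ASCII conversion that Python 2 performs implicitly: unicode() decodes any bytes it gets back from __unicode__, and str()/print encodes any unicode it gets back from __str__. The following is only a sketch that performs those two conversions explicitly. The variable names are illustrative, and it is written so that it behaves the same on Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
# Sketch: do by hand the two ASCII conversions that Python 2 applies
# implicitly. All variable names here are illustrative.

text = u'\xe4\xf6\u2193'        # the three characters as text (unicode)
raw = text.encode('utf-8')      # the same characters as UTF-8 bytes

# Case 1: __unicode__ returned bytes, so unicode() has to decode them.
try:
    raw.decode('ascii')
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

# Case 2: __str__ returned text, so str()/print has to encode it.
try:
    text.encode('ascii')
    encode_error = None
except UnicodeEncodeError as exc:
    encode_error = exc

print(decode_error)
print(encode_error)
```

The first failure corresponds to In [21] and In [22], the second to In [25].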

Easy fix: let __unicode__ return unicode and __str__ return bytes.
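A minimal sketch of that fix, assuming Python 2 semantics. The class name Label and its text attribute are hypothetical and not part of the original code. The methods are called directly instead of through unicode() and str(), so the snippet also runs on Python 3, where the unicode() builtin does not exist and __str__ is expected to return text.

```python
# -*- coding: utf-8 -*-
# Sketch of the suggested fix. Label is a hypothetical class name.

class Label(object):
    def __init__(self, text):
        self.text = text            # keep the value as unicode internally

    def __unicode__(self):
        return self.text            # text out, never encoded bytes

    def __str__(self):
        # Encode once, at the boundary, with an explicit codec.
        return self.__unicode__().encode('utf-8')


label = Label(u'\xe4\xf6\u2193')
as_text = label.__unicode__()       # what unicode(label) returns on Python 2
as_bytes = label.__str__()          # what str(label) returns on Python 2
```

With this split no implicit ASCII conversion is needed, which is why Test3 in the session above prints correctly.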