Nev*_*DNZ 7 python string unicode character-encoding cjk
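Basically, I just want to be able to create instances of a class named Bottle, e.g. class Bottle(object):..., and then, in another module, simply "print" any instance without having to hack the code to call character-encoding routines explicitly.
In short, when I try: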
obj=Bottle(u"味精")
print obj
or print "in place":
print Bottle(u"味精")
I get:
"UnicodeEncodeError: 'ascii' codec can't encode characters"
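Similar Stack Overflow questions:
Switching to Python 3 is not an option at the moment.
Solutions or hints (and explanations) on how to print UTF-8 (the way class U below manages to) would be much appreciated. :-)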
ThanX N.
-
Sample code:
-------- 8> < - - - - cut here - - - -
#!/usr/bin/env python
# -*- coding: utf-8 -*-

def setdefaultencoding(encoding="utf-8"):
    import sys, codecs
    org_encoding = sys.getdefaultencoding()
    if org_encoding == "ascii": # not good enough
        print "encoding set to "+encoding
        sys.stdout = codecs.getwriter(encoding)(sys.stdout)
        sys.stderr = codecs.getwriter(encoding)(sys.stderr)

setdefaultencoding()

msg=u"味精" # the message!

class U(unicode): pass

m1=U(msg)
print "A)", m1 # works fine, even with unicode, but

class Bottle(object):
    def __init__(self,msg): self.msg=msg
    def __repr__(self):
        print "debug: __repr__",self.msg
        return '{{{'+self.msg+'}}}'
    def __unicode__(self):
        print "debug: __unicode__",self.msg
        return '{{{'+self.msg+'}}}'
    def __str__(self):
        print "debug: __str__",self.msg
        return '{{{'+self.msg+'}}}'
    def decode(self,arg): print "debug: decode",self.msg
    def encode(self,arg): print "debug: encode",self.msg
    def translate(self,arg): print "debug: translate",self.msg

m2=Bottle(msg)
#print "B)", str(m2)
print "C) repr(x):", repr(m2)
print "D) unicode(x):", unicode(m2)
print "E)",m2 # gives: UnicodeEncodeError: 'ascii' codec can't encode characters
-------- 8> < - - - - cut here - - - -
Python 2.4 output:
encoding set to utf-8
A) 味精
C) repr(x): debug: __repr__ 味精
{{{\u5473\u7cbe}}}
D) unicode(x): debug: __unicode__ 味精
{{{味精}}}
E) debug: __str__ 味精
Traceback (most recent call last):
  File "./uc.py", line 43, in ?
    print "E)",m2 # gives: UnicodeEncodeError: 'ascii' codec can't encode characters
UnicodeEncodeError: 'ascii' codec can't encode characters in position 3-4: ordinal not in range(128)
-------- 8> < - - - - cut here - - - -
Python 2.6 output:
encoding set to utf-8
A) 味精
C) repr(x): debug: __repr__ 味精
Traceback (most recent call last):
  File "./uc.py", line 41, in <module>
    print "C) repr(x):", repr(m2)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 3-4: ordinal not in range(128)
If you use sys.stdout = codecs.getwriter(encoding)(sys.stdout), then you should pass Unicode strings to print:
>>> print u"%s" % Bottle(u"??????")
debug: __unicode__ ??????
{{{??????}}}
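To illustrate why the wrapped stream wants Unicode input, here is a minimal sketch. It assumes an `io.BytesIO` buffer as a stand-in for a byte-oriented stdout (that buffer is not part of the original code), and it is written so that it runs on both Python 2.6+ and Python 3.

```python
import codecs
import io

# Assumption: a BytesIO buffer stands in for a byte-oriented stdout,
# so the bytes that would reach the terminal can be inspected.
raw = io.BytesIO()
writer = codecs.getwriter("utf-8")(raw)

# The wrapper takes text (unicode in Python 2) and encodes it on write.
writer.write(u"{{{%s}}}" % u"\u5473\u7cbe")

encoded = raw.getvalue()  # the UTF-8 bytes that were written
```

The encoding step happens inside the writer, so the caller's job is only to hand it text. When an object is printed instead, Python 2 first calls its `__str__`, and a Unicode result there is converted with the default ASCII codec before the writer ever sees it, which is consistent with the traceback in the question.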
As @bobince pointed out in the comments: avoid changing sys.stdout this way, otherwise it may break any library code that uses sys.stdout and does not expect to print Unicode strings.
In general:
__unicode__() should return a Unicode string:
def __init__(self, msg, encoding='utf-8'):
    if not isinstance(msg, unicode):
        msg = msg.decode(encoding)
    self.msg = msg

def __unicode__(self):
    return u"{{{%s}}}" % self.msg
__repr__() should return an ASCII-friendly str object:
def __repr__(self):
    return "Bottle(%r)" % self.msg
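__str__() should return a str object. An optional encoding parameter is added to document which encoding is used. There is no good way to choose the encoding here: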
def __str__(self, encoding="utf-8"):
    return self.__unicode__().encode(encoding)
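Put together, the three methods above might look like the following sketch. The name `text_type` is a helper introduced here (it is `unicode` on Python 2 and `str` on Python 3) so the snippet can be run on either version; the `__str__` shown returns bytes, which matches Python 2 semantics only, so the methods are called directly rather than through `str()`.

```python
text_type = type(u"")  # hypothetical helper: unicode on Python 2, str on Python 3


class Bottle(object):
    def __init__(self, msg, encoding="utf-8"):
        # Normalise to text once, at the boundary.
        if not isinstance(msg, text_type):
            msg = msg.decode(encoding)
        self.msg = msg

    def __unicode__(self):
        # Always text.
        return u"{{{%s}}}" % self.msg

    def __repr__(self):
        # On Python 2, %r of a unicode value uses \u escapes, so this stays ASCII.
        return "Bottle(%r)" % self.msg

    def __str__(self, encoding="utf-8"):
        # Python 2 semantics: str is a byte string, so encode explicitly.
        return self.__unicode__().encode(encoding)


from_text = Bottle(u"\u5473\u7cbe")
from_bytes = Bottle(u"\u5473\u7cbe".encode("utf-8"))

as_text = from_text.__unicode__()   # text, what unicode(obj) returns on Python 2
as_bytes = from_text.__str__()      # UTF-8 bytes, what str(obj) returns on Python 2
```

Both constructor calls end up holding the same text, so every other method can assume `self.msg` is Unicode and only `__str__` has to deal with an encoding.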
Define a write() method:
def write(self, file, encoding=None):
    encoding = encoding or getattr(file, 'encoding', None)
    s = unicode(self)
    if encoding is not None:
        s = s.encode(encoding)
    return file.write(s)
This should cover the cases where the file has its own encoding or supports Unicode strings directly.
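A small sketch of how such a write() method behaves with different targets. `EncodedSink` is a hypothetical class made up for this example (a byte sink that advertises an `encoding` attribute, as a Python 2 file object does), and `self.__unicode__()` is called in place of `unicode(self)` so that the snippet also runs on Python 3.

```python
import io


class Bottle(object):
    def __init__(self, msg):
        self.msg = msg

    def __unicode__(self):
        return u"{{{%s}}}" % self.msg

    def write(self, file, encoding=None):
        # Prefer an explicit encoding, else whatever the file advertises.
        encoding = encoding or getattr(file, "encoding", None)
        s = self.__unicode__()  # unicode(self) on Python 2
        if encoding is not None:
            s = s.encode(encoding)
        return file.write(s)


class EncodedSink(object):
    """Hypothetical byte sink that advertises its own encoding."""
    encoding = "utf-8"

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(data)


bottle = Bottle(u"\u5473\u7cbe")

text_sink = io.StringIO()            # encoding is None: receives text unchanged
bottle.write(text_sink)

byte_sink = EncodedSink()            # advertises utf-8: receives encoded bytes
bottle.write(byte_sink)

raw = io.BytesIO()                   # no encoding attribute: pass one explicitly
bottle.write(raw, encoding="utf-8")
```

The caller decides where the text goes, and the encoding is resolved per target instead of being fixed globally on sys.stdout.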