Posts by use*_*233

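Why can't I display Chinese characters in Python even when I specify an encoding?

I just want to read a Chinese txt file and print its contents. This is the content of the txt file, which I copied from the web; it is in Simplified Chinese: http://stock.hexun.com/2013-06-01/154742801.html

At first, I tried this: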

userinput = raw_input('Enter the name of a file')
f=open(userinput,'r')
print f.read()
f.close()

It opens the file and prints it, but the output is garbled. Then I tried the following version, which specifies an encoding:

#coding=UTF-8
userinput = raw_input('Enter the name of a file')
import codecs
f= codecs.open(userinput,"r","UTF-8")
str1=f.read()
print str1
f.close()

However, this raises an error: UnicodeEncodeError: 'cp950' codec can't encode character u'\u76d8' in position 50: illegal multibyte sequence.

Why does this error occur, and how can I fix it? I have also tried other encodings such as Big5 and cp950, but it still doesn't work.
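The error message names cp950 as the codec that fails, which suggests the failure happens when print encodes the decoded text for a console using cp950 (Traditional Chinese), not when the file is read. The following is a minimal sketch of that situation, written for Python 3 while the question uses Python 2. The helper names can_encode and safe_for_console are hypothetical, and the console encoding is assumed from the error message.

```python
# Python 3 sketch; the code in the question is Python 2.
# can_encode and safe_for_console are hypothetical helper names.

def can_encode(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def safe_for_console(text, console_encoding):
    """Replace characters the console encoding cannot represent with '?'."""
    encoded = text.encode(console_encoding, errors="replace")
    return encoded.decode(console_encoding)


# u'\u76d8' is the Simplified Chinese character named in the error message.
ch = "\u76d8"
for enc in ("utf-8", "gbk", "cp950"):
    # Report which encodings are able to represent the character.
    print(enc, can_encode(ch, enc))

# A lossy fallback: the text prints, but unsupported characters become '?'.
print(safe_for_console("abc" + ch, "cp950"))
```

If this is indeed the cause, decoding the file as UTF-8 is already correct, and the remaining choice is between a console encoding that covers Simplified Chinese and a lossy fallback like the one above.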

python unicode encoding

use*_*233

lucky-day

4
votes

2
answers

8831
views

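What does the following code in pyteaser mean?

I am currently using pyteaser for summarization, and it works well. I have been reading the source code, but I don't understand the following code, even with the help of the comments in it. Can anyone explain it?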

def split_sentences(text):
    '''
    The regular expression matches all sentence ending punctuation and splits the string at those points.
    At this point in the code, the list looks like this ["Hello, world", "!" ... ]. The punctuation and all quotation marks
    are separated from the actual text. The first s_iter line turns each group of two items in the list into a tuple,
    excluding the last item in the list (the last item in the list does not need to …

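The docstring above describes splitting text at sentence-ending punctuation and then pairing each piece of text with the punctuation that follows it. Since the excerpt is truncated, here is a minimal sketch of that general technique only. It is not pyteaser's actual code: the function name split_keep_punctuation and the regular expression are assumptions made for illustration.

```python
import re


def split_keep_punctuation(text):
    """Split text into sentences, keeping each sentence's ending punctuation.

    Illustrative only; the name and the pattern are assumptions,
    not taken from pyteaser.
    """
    # A capturing group makes re.split keep the delimiters in the result,
    # so text pieces and punctuation runs alternate in the list.
    parts = re.split(r"([.!?]+)", text)

    # Pair every text piece with the punctuation run that follows it.
    # zip stops at the shorter sequence, so a trailing piece with no
    # punctuation after it is left out.
    pairs = zip(parts[0::2], parts[1::2])

    # Glue each pair back together and drop empty results.
    sentences = [(body + mark).strip() for body, mark in pairs]
    return [s for s in sentences if s]


print(split_keep_punctuation("Hello, world! How are you?"))
```

The pairing step is the part the docstring calls turning each group of two items into a tuple. Leaving out the unpaired final item is one possible reading of "excluding the last item in the list".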

python

use*_*233

2014 01-07

-1
votes

1
answer

374
views