Gar*_*eld 2 python encoding utf-8 character-encoding
# -*- coding: utf-8 -*-
from pyquery import PyQuery as pq
from urllib import urlencode
from urllib2 import Request,urlopen
def sendRequest(url, data = None, headersOnly = False):
    headers = { 'User-Agent' : 'Mozilla/5.0 (X11; U; Linux i686; en-US;)' }
    request = Request(url, data, headers)
    return urlopen(request).read()
resp = sendRequest("https://foursquare.com/v/rivers-edge-cafe-- morrison/4c1907776e02b7132eae627b")
print pq(resp)("#venueCategories").text()
The output should be Café, Burger Joint, Sandwich Place, but an exception is raised instead:
Traceback (most recent call last):
  File "unicodeerr1.py", line 11, in <module>
    print pq(resp)("#venueCategories").text()
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 3: ordinal not in range(128)
You need to specify an encoding for your source code, or use character escapes.
# -*- coding: utf-8 -*-
or
print 'caf\xc3\xa9' # UTF-8 bytes for e with an acute accent (e accent aigu).
You probably want to use a Unicode literal instead (here with a Unicode escape sequence):
print u'caf\u00e9'
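
To make the relationship between the two forms above concrete, here is a small sketch showing that the byte string and the Unicode literal are the same text at different layers. The variable names are mine, not from the question, and the `b'...'` prefix assumes Python 2.6 or later (it also runs on Python 3.3+):

```python
# -*- coding: utf-8 -*-
# Illustrative sketch; variable names are hypothetical, not from the question.
utf8_bytes = b'caf\xc3\xa9'   # UTF-8 encoded bytes (a plain str on Python 2)
text = u'caf\u00e9'           # Unicode text: c, a, f, U+00E9

# Decoding the bytes yields the Unicode text, and encoding goes back again.
assert utf8_bytes.decode('utf-8') == text
assert text.encode('utf-8') == utf8_bytes

# The text has 4 characters, but its UTF-8 encoding takes 5 bytes.
char_count = len(text)
byte_count = len(utf8_bytes)
```

Keeping text as Unicode inside the program and encoding only at the point of output is usually the easier discipline to follow.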
Please read the Unicode HOWTO to fully understand what is going on here.
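Note that your specific error has nothing to do with Python 2.5 versus 2.7; it has everything to do with the output encoding of whatever you are printing to. On the server running Python 2.5, either no encoding is specified or it is explicitly set to ASCII, whereas on your local machine with Python 2.7 you are most likely working with a terminal that supports UTF-8.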
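
One way to sidestep the dependence on the output encoding is to encode the text yourself before printing. The following is a minimal sketch, not the only possible fix. `encode_for_output` is a hypothetical helper name, and UTF-8 is an assumption about what the consumer of the output expects:

```python
# -*- coding: utf-8 -*-
# Illustrative sketch; `encode_for_output` is a hypothetical helper name.

def encode_for_output(text, encoding='utf-8'):
    """Turn Unicode text into bytes in an explicitly chosen encoding."""
    return text.encode(encoding)

category = u'Caf\u00e9'

# This mirrors what print attempts when the output encoding is ASCII:
# U+00E9 has no ASCII representation, so encoding fails.
try:
    encode_for_output(category, 'ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Encoding explicitly to UTF-8 succeeds regardless of the terminal settings.
utf8_bytes = encode_for_output(category, 'utf-8')
```

In the question's script this would amount to printing `pq(resp)("#venueCategories").text().encode('utf-8')`, assuming whatever reads the output expects UTF-8.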