I have one more error to fix.
    row = OpenThisLink + titleTag + JD
    try:
        csvwriter.writerow([row])
    except (UnicodeEncodeError, UnicodeDecodeError):
        pass
This gives the following error (for the character "ń"):
        row = OpenThisLink + str(titleTag) + JD
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 51: ordinal not in range(128)
    >>> title = "hello Giliciński"
    Unsupported characters in input
    >>> u = unicode(title, "latin1")
    Traceback (most recent call last):
      File "<pyshell#56>", line 1, in <module>
        u = unicode(title, "latin1")
    NameError: name 'title' is not defined
    >>> title = "ń"
    Unsupported characters in input
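According to the documentation:
"Unlike a similar case with UnicodeEncodeError, such a failure cannot always be avoided."
In fact, my except clause does not seem to work. Any suggestions?
Thanks!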
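"In fact, my except clause does not seem to work. Any suggestions?"
row = OpenThisLink + titleTag + JD is outside the try/except block, so any exception raised while that statement runs will not be caught. This, however, will catch the exception: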
    try:
        row = OpenThisLink + titleTag + JD
        csvwriter.writerow([row])
    except (UnicodeEncodeError, UnicodeDecodeError):
        print "Caught unicode error"
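The scoping point can be illustrated with a small self-contained sketch. The function names here (risky_encode, build_outside_try, build_inside_try) are hypothetical and only stand in for the statements in the question; an explicit .encode('ascii') is used as an assumed stand-in for whatever operation raises the error.

```python
def risky_encode(text):
    # Stand-in for any statement that may raise UnicodeEncodeError.
    return text.encode('ascii')


def build_outside_try(text):
    # The failing statement runs before the try block is entered,
    # so the except clause below never gets a chance to handle it.
    row = risky_encode(text)
    try:
        return row
    except UnicodeEncodeError:
        return None


def build_inside_try(text):
    # The failing statement is inside the try block, so it is caught.
    try:
        row = risky_encode(text)
        return row
    except UnicodeEncodeError:
        return None
```

Calling build_inside_try(u'Gilici\u0144ski') returns None, whereas build_outside_try with the same argument lets the UnicodeEncodeError propagate to the caller.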
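However, in the code you posted, row = OpenThisLink + titleTag + JD will not raise a UnicodeEncodeError if titleTag contains a unicode string; the result of the string concatenation will be of type unicode. (The traceback in your question shows str(titleTag) instead, and calling str() on a unicode object implicitly encodes it with the ascii codec, which does fail for non-ASCII characters.)
Now, the csv module does not support unicode, so when you call writerow() with unicode data it raises UnicodeEncodeError. You need to encode your unicode strings to a suitable encoding (UTF-8 is best) and then pass the result to writerow(), for example: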
    >>> titleTag = "hello Giliciński"
    >>> titleTag
    'hello Gilici\xc5\x84ski'
    >>> type(titleTag)
    <type 'str'>
    >>>
    >>> titleTag = titleTag.decode('utf8')
    >>> titleTag
    u'hello Gilici\u0144ski'
    >>> type(titleTag)
    <type 'unicode'>
    >>>
    >>> csvwriter.writerow([titleTag])
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    UnicodeEncodeError: 'ascii' codec can't encode character u'\u0144' in position 12: ordinal not in range(128)
    >>>
    >>> # but this will work...
    >>> csvwriter.writerow([titleTag.encode('utf8')])
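If a row has several fields, encoding each one by hand gets repetitive. Below is a minimal sketch of a helper that does it for a whole row; encode_row and text_type are hypothetical names introduced for illustration, not part of the csv module, and the sketch assumes every text field should be written as UTF-8.

```python
# unicode on Python 2, str on Python 3
text_type = type(u'')


def encode_row(values, encoding='utf-8'):
    """Return a new list in which every text value is encoded to bytes.

    Non-text values (numbers, already-encoded byte strings) are left as-is.
    """
    return [v.encode(encoding) if isinstance(v, text_type) else v
            for v in values]


row = encode_row([u'hello Gilici\u0144ski', 42])
# On Python 2 the result can then be written with csvwriter.writerow(row).
```

Encoding at the last moment, right before writerow(), keeps the rest of the program working with unicode objects and confines the byte-string handling to one place.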
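The relevant Python documentation is here. Be sure to look at the examples, especially the last one.
BTW, pyshell does not seem to accept non-ASCII characters as input, so use the plain Python interpreter instead.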