mab*_*n69 5 python python-3.x python-unicode
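This is my first post, so please be kind. I have searched a lot, but most of what I found relates to Python 2.
I have a Python 3 script that builds a zip file from a list of files. It fails with a UnicodeEncodeError only when it is run from crontab; when run from an interactive console it works fine. I suppose something in the environment must be responsible, but I can't figure out what.
Here is the code excerpt: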
def zipFileList(self, rootfolder, filelist, zip_file, logger):
    count = 0
    logger.info("Generazione file zip {0}: da {1} files".format(zip_file, len(filelist)))
    zip = zipfile.ZipFile(zip_file, "w", compression=zipfile.ZIP_DEFLATED)
    for curfile in filelist:
        zip.write(os.path.join(rootfolder, curfile), curfile, zipfile.ZIP_DEFLATED)
        count = count + 1
    zip.close()
    logger.info("Scrittura terminata: {0} files".format(count))
Here is the log output for this code snippet:
2012-07-31 09:10:03,033: root - ERROR - Traceback (most recent call last):
  File "/usr/local/lib/python3.2/zipfile.py", line 365, in _encodeFilenameFlags
    return self.filename.encode('ascii'), self.flag_bits
UnicodeEncodeError: 'ascii' codec can't encode characters in position 56-57: ordinal not in range(128)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "XBE.py", line 45, in main
    pam.executeList(logger)
  File "/home/vte/vtebackup/vte41/scripts/ptActivityManager.py", line 62, in executeList
    self.executeActivity(act, logger)
  File "/home/vte/vtebackup/vte41/scripts/ptActivityManager.py", line 71, in executeActivity
    self.exAct_FileBackup(act, logger)
  File "/home/vte/vtebackup/vte41/scripts/ptActivityManager.py", line 112, in exAct_FileBackup
    ptfs.zipFileList(srcfolder, filelist, arcfilename, logger)
  File "/home/vte/vtebackup/vte41/scripts/ptFileManager.py", line 143, in zipFileList
    zip.write(os.path.join(rootfolder, curfile), curfile, zipfile.ZIP_DEFLATED)
  File "/usr/local/lib/python3.2/zipfile.py", line 1115, in write
    self.fp.write(zinfo.FileHeader())
  File "/usr/local/lib/python3.2/zipfile.py", line 355, in FileHeader
    filename, flag_bits = self._encodeFilenameFlags()
  File "/usr/local/lib/python3.2/zipfile.py", line 367, in _encodeFilenameFlags
    return self.filename.encode('utf-8'), self.flag_bits | 0x800
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcc3' in position 56: surrogates not allowed
Here is the crontab line:
10 9 * * * /home/vte/vtebackup/vte41/scripts/runbackup.sh >/dev/null 2>&1
Here are the contents of runbackup.sh:
#! /bin/bash -l
cd /home/vte/vtebackup/vte41/scripts
/usr/local/bin/python3.2 XBE.py
The exception always occurs on the same file, but that file does not appear to contain any non-ASCII characters:
/var/vhosts/vte41/http_docs/vtecrm41/storage/2012/July/week4/169933_Puccini_Gabriele.tif
The operating system is Ubuntu Linux 10.04 LTS and the Python version is 3.2 (installed alongside other Python versions). All Python source files have this shebang
#!/usr/bin/env python3.2
as their first line.
Can you help me find out what is wrong and how to fix it?
mab*_*n69 17
A team member found the solution in a Python bug thread.
The problem was solved by adding a LANG setting to the script command in the crontab:
* * * * * LANG=it_IT.UTF-8 /home/vte/vtebackup/vte41/scripts/runbackup.sh >/dev/null 2>&1
I hope this is useful to others, because I scratched my head over this one for a while :)
Check your locale. On an interactive console, run the command locale. Here is what I get:
LANG=
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL="en_US.UTF-8"
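The same check can also be made from inside the script, so that the result ends up in the log when cron runs it. This is only a minimal sketch: the function name describe_encoding_environment is made up for illustration and is not part of the original script. It uses only standard library calls (sys.getfilesystemencoding and locale.getpreferredencoding).

```python
import locale
import os
import sys


def describe_encoding_environment(environ=os.environ):
    """Collect the settings that influence how Python decodes file names.

    The function name and the returned keys are illustrative assumptions.
    """
    return {
        "LANG": environ.get("LANG"),
        "LC_ALL": environ.get("LC_ALL"),
        "LC_CTYPE": environ.get("LC_CTYPE"),
        # Encoding Python uses to convert file names between bytes and str.
        "filesystem_encoding": sys.getfilesystemencoding(),
        # Encoding derived from the current locale settings.
        "preferred_encoding": locale.getpreferredencoding(False),
    }


if __name__ == "__main__":
    for key, value in describe_encoding_environment().items():
        print("{0}: {1!r}".format(key, value))
```

Logging this once from the interactive console and once from the cron job should show whether the two environments really differ.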
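Python determines how to interpret file names from the LC_CTYPE or LANG environment variables, and I strongly suspect that one of them is set to a different encoding in your cron environment.
If that is the case, your file names are decoded to Unicode using a different encoding, which then yields file names that cannot be encoded as UTF-8 or ASCII. The lone surrogate '\udcc3' in your traceback is consistent with this: it is how Python represents a byte (here 0xC3) that it could not decode with the locale's encoding.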
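The mechanism can be reproduced without cron. The sketch below uses a made-up file name (not the one from the question) and assumes the name is stored on disk as UTF-8 while Python decodes it as ASCII with the surrogateescape error handler (PEP 383).

```python
# Hypothetical file name containing U+00E8 (e with grave accent).
original = "caff\u00e8.tif"
on_disk = original.encode("utf-8")  # the bytes the filesystem stores

# Under an ASCII locale, bytes that cannot be decoded are turned into
# lone surrogates (U+DC80..U+DCFF) instead of raising an error.
decoded = on_disk.decode("ascii", errors="surrogateescape")
print(ascii(decoded))

# zipfile later tries to encode the name; lone surrogates are not valid
# in UTF-8, so this step fails just like in the traceback.
try:
    decoded.encode("utf-8")
    failed = False
except UnicodeEncodeError as exc:
    failed = True
    print(exc)

# Decoding with the encoding that was actually used gives a clean string.
correct = on_disk.decode("utf-8")
print(correct == original)
```

The byte 0xC3 is the first byte of many accented Latin letters in UTF-8, which would fit a name that looks like plain ASCII in some listings but is not.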
Simply set the LC_CTYPE variable in your cron definition, either on a line before the schedule entries or as part of the command to execute:
LC_CTYPE="en_US.UTF-8"
* * * * * yourscriptcommand.py
As always with Python Unicode questions, the answer lies in the Unicode HOWTO, in the section on file names.
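If changing the cron environment is not an option, one possible workaround is to repair the names inside the script before passing them to zipfile. This is a sketch under stated assumptions, not something from the answers above: the helper name safe_arcname is hypothetical, the file names are assumed to be UTF-8 on disk, and the code is assumed to run on a Unix-like system, where os.fsencode uses the surrogateescape handler.

```python
import io
import os
import zipfile


def safe_arcname(name, assumed_encoding="utf-8"):
    """Re-decode a possibly mis-decoded file name for use inside an archive.

    Hypothetical helper: os.fsencode recovers the original bytes from any
    surrogate escapes, then the bytes are decoded with the assumed encoding.
    Bytes that still cannot be decoded are replaced rather than raising.
    """
    raw = os.fsencode(name)
    return raw.decode(assumed_encoding, errors="replace")


# Made-up example of a name as Python would see it under an ASCII locale.
broken = "caff\udcc3\udca8.tif"
fixed = safe_arcname(broken)

# Write to an in-memory archive to show that the repaired name is accepted.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr(fixed, b"example data")
    names = archive.namelist()
print(names)
```

In zipFileList this would mean passing safe_arcname(curfile) as the second argument of zip.write while leaving the first argument untouched, since the original string is still needed to open the file on disk. Setting the locale in cron remains the simpler fix.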