Mun*_*kin 6 python zip python-unicode
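I have a small script that extracts .zip files. It works well, but only for archives whose member filenames contain no letters such as "ä", "ö", "ü" (and so on). Otherwise I get this error: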
Exception in thread Thread-1:
Traceback (most recent call last):
  File "threading.pyc", line 552, in __bootstrap_inner
  File "install.py", line 92, in run
  File "zipfile.pyc", line 962, in extractall
  File "zipfile.pyc", line 950, in extract
  File "zipfile.pyc", line 979, in _extract_member
  File "ntpath.pyc", line 108, in join
UnicodeDecodeError: 'ascii' codec can't decode byte 0x94 in position 32: ordinal not in range(128)
Here is the extraction part of my script:
zip = zipfile.ZipFile(path1)
zip.extractall(path2)
How can I solve this problem?
A suggestion:
I get the same kind of error when I do this:
>>> c = chr(129)
>>> c + u'2'
Traceback (most recent call last):
  File "<pyshell#21>", line 1, in <module>
    c + u'2'
UnicodeDecodeError: 'ascii' codec can't decode byte 0x81 in position 0: ordinal not in range(128)
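So a unicode string is being passed to join somewhere, where it gets combined with a byte string that contains non-ASCII bytes.
Could it be that the file path for the zipfile is a unicode string? What happens if you do this: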
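To make the failure mode concrete, here is a minimal sketch of the difference between an implicit ASCII decode and an explicit one. The file name and target folder are made up for illustration, and cp437 is an assumption: it is the traditional default encoding for ZIP member names, and byte 0x94 happens to be "ö" in that code page, but the archive in the question may use something else.

```python
# Hypothetical member name as raw bytes; 0x94 is "ö" in code page 437.
raw_name = b"sch\x94n.txt"

# Decoding explicitly with the assumed archive encoding gives a text string
# that can be combined safely with another text string.
text_name = raw_name.decode("cp437")
combined = u"C:\\target\\" + text_name

# Decoding as ASCII (which Python 2 does implicitly when a byte string is
# added to a unicode string) fails on the same byte.
ascii_error = None
try:
    raw_name.decode("ascii")
except UnicodeDecodeError as exc:
    ascii_error = exc
```

If the explicit decode produces the expected file name, the archive most likely stores its names in that code page.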
zip = zipfile.ZipFile(str(path1))
zip.extractall(str(path2))
Or this:
zip = zipfile.ZipFile(unicode(path1))
zip.extractall(unicode(path2))
This is line 128 in ntpath:
def join(a, *p): # 63
    for b in p: # 68
        path += "\\" + b # 128
Second suggestion:
from ntpath import *

def join(a, *p):
    """Join two or more pathname components, inserting "\\" as needed.
    If any component is an absolute path, all previous path components
    will be discarded."""
    path = a
    for b in p:
        b_wins = 0  # set to 1 iff b makes path irrelevant
        if path == "":
            b_wins = 1

        elif isabs(b):
            # This probably wipes out path so far.  However, it's more
            # complicated if path begins with a drive letter:
            #     1. join('c:', '/a') == 'c:/a'
            #     2. join('c:/', '/a') == 'c:/a'
            # But
            #     3. join('c:/a', '/b') == '/b'
            #     4. join('c:', 'd:/') = 'd:/'
            #     5. join('c:/', 'd:/') = 'd:/'
            if path[1:2] != ":" or b[1:2] == ":":
                # Path doesn't start with a drive letter, or cases 4 and 5.
                b_wins = 1

            # Else path has a drive letter, and b doesn't but is absolute.
            elif len(path) > 3 or (len(path) == 3 and
                                   path[-1] not in "/\\"):
                # case 3
                b_wins = 1

        if b_wins:
            path = b
        else:
            # Join, and ensure there's a separator.
            assert len(path) > 0
            if path[-1] in "/\\":
                if b and b[0] in "/\\":
                    path += b[1:]
                else:
                    path += b
            elif path[-1] == ":":
                path += b
            elif b:
                if b[0] in "/\\":
                    path += b
                else:
                    # !!! modify the next line so it works !!!
                    path += "\\" + b
            else:
                # path is not empty and does not end with a backslash,
                # but b is empty; since, e.g., split('a/') produces
                # ('a', ''), it's best if join() adds a backslash in
                # this case.
                path += '\\'

    return path

import ntpath
ntpath.join = join
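The line marked "modify the next line so it works" is left open above. One possible way to fill it in, as a rough sketch only, is to decode any byte string explicitly before concatenating. The helper names `to_text` and `join_with_backslash` are hypothetical, and the cp437 default is an assumption about how the archive stores its names.

```python
def to_text(value, encoding="cp437"):
    """Return value as a text string, decoding byte strings explicitly.

    The default encoding is an assumption; use whatever encoding the
    archive's member names are really stored in.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def join_with_backslash(path, b, encoding="cp437"):
    """Mimic `path += "\\" + b` without relying on an implicit ASCII decode."""
    return to_text(path, encoding) + u"\\" + to_text(b, encoding)


# Example: a text target folder joined with a byte-string member name.
joined = join_with_backslash(u"C:\\target", b"sch\x94n.txt")
```

Inside the patched `join`, the marked line could then read `path = join_with_backslash(path, b)`. The other `path += b` branches would need the same treatment to be fully consistent.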