How to read from a zip file within a zip file in Python?

Mic*_*son 33 python zip file unzip

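I have a file that I want to read which is itself zipped inside a zip archive. For example, parent.zip contains child.zip, which contains child.txt. I'm having trouble reading child.zip. Can anyone correct my code?

I assume I need to create child.zip as a file-like object and then open it with a second zipfile instance, but being new to Python, my zipfile.ZipFile(zfile.open(name)) is naive. It raises zipfile.BadZipfile: "File is not a zip file" on the (independently validated) child.zip.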

import logging
import re
import zipfile

with zipfile.ZipFile("parent.zip", "r") as zfile:
    for name in zfile.namelist():
        if re.search(r'\.zip$', name) is not None:
            # We have a zip within a zip
            # The next line is the call that raises BadZipfile
            with zipfile.ZipFile(zfile.open(name)) as zfile2:
                for name2 in zfile2.namelist():
                    # Now we can extract
                    logging.info("Found internal internal file: " + name2)
                    print("Processing code goes here")

Mar*_*ers 45

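When you use the .open() call on a ZipFile instance, you do get an open file handle. However, to read a zip file, the ZipFile class needs a little more. It needs to be able to seek on that file, and the object returned by .open() is not seekable.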
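To illustrate why seeking matters: a zip archive keeps its central directory at the end of the file, so ZipFile has to seek before it can list any members. The sketch below is only an illustration and not part of the original answer. NonSeekableReader, build_zip_bytes and can_open_as_zip are hypothetical names made up for the demo. Newer Python 3 releases (3.7 and later, as far as I know) make the object returned by .open() seekable, so the original failure may not reproduce there.

```python
import io
import zipfile


class NonSeekableReader(io.RawIOBase):
    """Hypothetical wrapper: readable, but refuses to seek."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def readable(self):
        return True

    def seekable(self):
        return False

    def readinto(self, b):
        # seek() is deliberately not implemented, so the inherited
        # version raises io.UnsupportedOperation.
        return self._buf.readinto(b)


def build_zip_bytes():
    """Create a tiny zip archive in memory and return its raw bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("child.txt", "hello")
    return buf.getvalue()


def can_open_as_zip(fileobj):
    """Return True if ZipFile can read the member list from fileobj."""
    try:
        with zipfile.ZipFile(fileobj) as zf:
            zf.namelist()
        return True
    except (zipfile.BadZipFile, OSError, ValueError):
        return False


data = build_zip_bytes()
seekable_ok = can_open_as_zip(io.BytesIO(data))
non_seekable_ok = can_open_as_zip(NonSeekableReader(data))
```

The same bytes open fine from a seekable buffer and fail from the non-seekable wrapper, which matches the symptom in the question.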

The workaround is to read the whole inner zip into memory using .read(), store it in a BytesIO object (an in-memory file that is seekable), and feed that to ZipFile:

from io import BytesIO

# ...
        zfiledata = BytesIO(zfile.read(name))
        with zipfile.ZipFile(zfiledata) as zfile2:

Or, in the context of your example:

import logging
import re
import zipfile
from io import BytesIO

with zipfile.ZipFile("parent.zip", "r") as zfile:
    for name in zfile.namelist():
        if re.search(r'\.zip$', name) is not None:
            # We have a zip within a zip
            # Read it fully into a seekable in-memory buffer
            zfiledata = BytesIO(zfile.read(name))
            with zipfile.ZipFile(zfiledata) as zfile2:
                for name2 in zfile2.namelist():
                    # Now we can extract
                    logging.info("Found internal internal file: " + name2)
                    print("Processing code goes here")
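For anyone who wants to try this without real files on disk, here is a minimal self-contained sketch of the same BytesIO approach. It builds the parent.zip / child.zip / child.txt layout from the question in memory and reads it back. The helper names build_nested_zip and read_inner_files and the sample text are assumptions made for the demo, not something from the question.

```python
import io
import zipfile


def build_nested_zip():
    """Build parent.zip -> child.zip -> child.txt entirely in memory."""
    child_buf = io.BytesIO()
    with zipfile.ZipFile(child_buf, "w") as child:
        child.writestr("child.txt", "hello from child.txt")

    parent_buf = io.BytesIO()
    with zipfile.ZipFile(parent_buf, "w") as parent:
        parent.writestr("child.zip", child_buf.getvalue())

    parent_buf.seek(0)
    return parent_buf


def read_inner_files(parent_file):
    """Return {(inner zip name, member name): bytes} for every inner zip."""
    found = {}
    with zipfile.ZipFile(parent_file, "r") as zfile:
        for name in zfile.namelist():
            if name.lower().endswith(".zip"):
                # Load the inner archive into a seekable in-memory buffer
                zfiledata = io.BytesIO(zfile.read(name))
                with zipfile.ZipFile(zfiledata) as zfile2:
                    for name2 in zfile2.namelist():
                        found[(name, name2)] = zfile2.read(name2)
    return found


contents = read_inner_files(build_nested_zip())
```

Keep in mind that .read() loads the whole inner archive into memory, so this is best suited to inner zips of modest size.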


zlr*_*zlr 10

To get this working with Python 3.3 (under Windows, though that is probably not relevant), I had to do this:

import zipfile, re, io

# file: path to (or file object for) the outer zip
with zipfile.ZipFile(file, 'r') as zfile:
    for name in zfile.namelist():
        if re.search(r'\.zip$', name) is not None:
            # Read the inner zip into a seekable in-memory buffer
            zfiledata = io.BytesIO(zfile.read(name))
            with zipfile.ZipFile(zfiledata) as zfile2:
                for name2 in zfile2.namelist():
                    print(name2)

cStringIO does not exist there, so I used io.BytesIO.
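The same io.BytesIO approach also extends to zips nested more than one level deep. The following is a rough sketch under that assumption, written for Python 3. iter_nested_members and zip_bytes are hypothetical helper names, and the archive layout is invented for the demo.

```python
import io
import zipfile


def iter_nested_members(fileobj, prefix=""):
    """Yield (path, data) for each non-zip member, descending into inner zips."""
    with zipfile.ZipFile(fileobj) as zf:
        for name in zf.namelist():
            data = zf.read(name)
            path = prefix + name
            if name.lower().endswith(".zip"):
                # Wrap the inner archive in a seekable buffer and recurse
                yield from iter_nested_members(io.BytesIO(data), path + "/")
            else:
                yield path, data


def zip_bytes(members):
    """Build a zip archive in memory from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in members.items():
            zf.writestr(name, data)
    return buf.getvalue()


# Demo layout: outer -> middle.zip -> (child.zip -> child.txt, note.txt)
inner = zip_bytes({"child.txt": b"deep"})
middle = zip_bytes({"child.zip": inner, "note.txt": b"mid"})
outer = zip_bytes({"middle.zip": middle})

result = list(iter_nested_members(io.BytesIO(outer)))
```

Every level is read fully into memory, so this trades memory for simplicity.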