use*_*143 8 python zip amazon-s3
I have zip files uploaded to S3. I want to download them for processing. I don't need to keep them permanently, but I do need to process them temporarily. How can I do this?
bri*_*ice 18
Because working software > comprehensive documentation:
import zipfile
import io

import boto
from boto.s3.key import Key

# Connect to S3.
# This will need your S3 credentials to be set up
# with `aws configure` using the AWS CLI.
#
# See: https://aws.amazon.com/cli/
conn = boto.connect_s3()

# Get hold of the bucket
bucket = conn.get_bucket("my_bucket_name")

# Get hold of a given file (key) in the bucket
key = Key(bucket)
key.key = "my_s3_object_key"

# Create an in-memory bytes buffer
with io.BytesIO() as b:
    # Download the object into the buffer
    key.get_file(b)

    # Reset the file pointer to the beginning
    b.seek(0)

    # Read the buffer as a zipfile and process the members
    with zipfile.ZipFile(b, mode='r') as zipf:
        for subfile in zipf.namelist():
            do_stuff_with_subfile()
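do_stuff_with_subfile() above is just a placeholder. As a rough, hypothetical illustration, the loop body could read each member straight out of the in-memory archive without touching disk, for example by printing each member's size:

with zipfile.ZipFile(b, mode='r') as zipf:
    for subfile in zipf.namelist():
        # Read this member's raw bytes directly from the buffer
        with zipf.open(subfile) as member:
            data = member.read()
            print(subfile, len(data), "bytes")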
The same thing, using boto3:
import zipfile
import io

import boto3

# This is just for the demo. Real use should rely on the config
# environment variables or a config file.
#
# See: http://boto3.readthedocs.org/en/latest/guide/configuration.html
session = boto3.session.Session(
    aws_access_key_id="ACCESSKEY",
    aws_secret_access_key="SECRETKEY"
)

s3 = session.resource("s3")
bucket = s3.Bucket('stackoverflow-brice-test')
obj = bucket.Object('smsspamcollection.zip')

with io.BytesIO(obj.get()["Body"].read()) as tf:
    # Rewind the buffer to the start
    tf.seek(0)

    # Read the buffer as a zipfile and process the members
    with zipfile.ZipFile(tf, mode='r') as zipf:
        for subfile in zipf.namelist():
            print(subfile)
Tested with Python 3 on Mac OS X.
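As a small variation (not from the answer above), boto3 can also stream the object straight into the buffer with download_fileobj instead of reading the whole response body first; the bucket and key names below are placeholders:

import io
import zipfile

import boto3

s3 = boto3.resource("s3")
obj = s3.Object("my-bucket-name", "my-archive.zip")  # placeholder names

with io.BytesIO() as buf:
    # Stream the S3 object directly into the in-memory buffer
    obj.download_fileobj(buf)
    buf.seek(0)
    with zipfile.ZipFile(buf, mode='r') as zipf:
        for subfile in zipf.namelist():
            print(subfile)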
If speed is a concern, a good approach is to pick an EC2 instance fairly close to your S3 bucket (in the same region) and use that instance to unzip and process your zipped files.
This reduces latency and lets you process them fairly efficiently. Once the work is done, you can delete each extracted file, as in the sketch below.
Note: this only works if you are OK with using EC2 instances.
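A minimal sketch of that clean-up pattern, assuming the archive has already been downloaded to a local path on the instance (the path is a placeholder); tempfile.TemporaryDirectory removes every extracted file once processing is finished:

import tempfile
import zipfile

archive_path = "/tmp/my-archive.zip"  # placeholder: archive already downloaded

with tempfile.TemporaryDirectory() as workdir:
    with zipfile.ZipFile(archive_path) as zipf:
        zipf.extractall(workdir)  # extract members for processing
    # ... process the extracted files under workdir here ...
# the temporary directory and all extracted files are removed on exit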