AWS S3 image saving loses metadata

Question

Sea*_*ley 2 python amazon-s3 python-imaging-library

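I am using an AWS Lambda function written in Python 2.7.x that downloads an image file, saves it to /tmp, and then uploads it back to a bucket.

In the original bucket, my image objects carry metadata in the form of HTTP headers such as Content-Type = image/jpeg, etc.

After saving the image with PIL, all the headers are gone and I am left with only Content-Type = binary/octet-stream.

From what I can tell, image.save is losing the headers because of the way PIL works. How do I either preserve the metadata or at least apply it to the newly saved image?

I have seen posts suggesting that this metadata lives in the EXIF data, but I tried reading the EXIF info from the original file and applying it to the saved file with no luck. In any case, I am not sure it is in the EXIF data at all.

Partial code to give an idea of what I am doing: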

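import os
import urllib
import uuid

import boto3
from PIL import Image

s3_client = boto3.client('s3')

def resize_image(image_path):
    # Re-save the image through PIL; this only writes the local file.
    with Image.open(image_path) as image:
        image.save(upload_path, optimize=True)

def handler(event, context):
    global upload_path
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        # Use the current record (not Records[0]) so every event record is handled.
        key = urllib.unquote_plus(record['s3']['object']['key'].encode("utf8"))
        # file_name is not defined in the original snippet; the key's base name is assumed here.
        file_name = os.path.basename(key)

        download_path = '/tmp/{}{}'.format(uuid.uuid4(), file_name)
        upload_path = '/tmp/resized-{}'.format(file_name)

        s3_client.download_file(bucket, key, download_path)

        resize_image(download_path)
        # Uploaded without ExtraArgs, so the source object's Content-Type is not carried over.
        s3_client.upload_file(upload_path, '{}resized'.format(bucket), key)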

Thanks to Sergey, I switched to get_object, but the response is missing the metadata:

response = s3_client.get_object(Bucket=bucket,Key=key)

response = {u'Body': <botocore.response.StreamingBody object>, u'AcceptRanges': 'bytes', u'ContentType': 'image/jpeg', 'ResponseMetadata': {'HTTPStatusCode': 200, 'RetryAttempts': 0, 'HostId': 'au30hBMN37/ti0WCfDqlb3t9ehainumc9onVYWgu+CsrHtvG0u/zmgcOIvCCBKZgQrGoooZoW9o=', 'RequestId': '1A94D7F01914A787', 'HTTPHeaders': {'content-length': '84053', 'x-amz-id-2': 'au30hBMN37/ti0WCfDqlb3t9ehainumc9onVYWgu+CsrHtvG0u/zmgcOIvCCBKZgQrGoooZoW9o=', 'accept-ranges': 'bytes', 'expires': 'Sun, 01 Jan 2034 00:00:00 GMT', 'server': 'AmazonS3', 'last-modified': 'Fri, 23 Dec 2016 15:21:56 GMT', 'x-amz-request-id': '1A94D7F01914A787', 'etag': '"9ba59e5457da0dc40357f2b53715619d"', 'cache-control': 'max-age=2592000,public', 'date': 'Fri, 23 Dec 2016 15:21:58 GMT', 'content-type': 'image/jpeg'}}, u'LastModified': datetime.datetime(2016, 12, 23, 15, 21, 56, tzinfo=tzutc()), u'ContentLength': 84053, u'Expires': datetime.datetime(2034, 1, 1, 0, 0, tzinfo=tzutc()), u'ETag': '"9ba59e5457da0dc40357f2b53715619d"', u'CacheControl': 'max-age=2592000,public', u'Metadata': {}}

If I use: metadata = response['ResponseMetadata']['HTTPHeaders']

metadata = {'content-length': '84053', 'x-amz-id-2': 'f5UAhWzx7lulo3cMVF8hdVRbHnhdnjHWRDl+LDFkYm9pubjL0A01L5yWjgDjWRE4TjRnjqDeA0U', 'accept-ranges': 'bytes', 'expires': 'Sun, 01 Jan 2034 00:00:00 GMT', 'server': 'AmazonS3', 'last-modified': 'Fri, 23 Dec 2016 15:47:09 GMT', 'x-amz-request-id': '4C69DF8A58EF3380', 'etag': '"9ba59e5457da0dc40357f2b53715619d"', 'cache-control': 'max-age=2592000,public', 'date': 'Fri, 23 Dec 2016 15:47:10 GMT', 'content-type': 'image/jpeg'}

and save it with put_object:

s3_client.put_object(Bucket=bucket+'resized',Key=key, Metadata=metadata, Body=downloadfile)

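this creates a lot of extra metadata in S3. In particular, it does not save the Content-Type as image/jpeg but as binary/octet-stream, and it does create the metadata entry x-amz-meta-content-type = image/jpeg.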

Answer 1

Ser*_*lev 5

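You are confusing S3 metadata, which AWS S3 stores alongside the object, with EXIF metadata, which is stored inside the file itself.

download_file() does not fetch the object's attributes from S3. You should use get_object() instead: https://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Client.get_object

Then you can use put_object() with the same attributes to upload the new file: https://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Client.put_object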
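
As a rough sketch of that approach (an illustration only, not tested against a real bucket): get_object returns properties such as ContentType and CacheControl as top-level keys of the response, and put_object accepts them back under the same keyword names. The Metadata argument is only for user-defined entries, which S3 stores as x-amz-meta-* headers, so passing the raw HTTP header dictionary there would produce keys like x-amz-meta-content-type. The helper names put_kwargs_from_get_response and copy_with_properties, the SYSTEM_FIELDS tuple, and the transform callback are all hypothetical names made up for this example.

```python
# Object properties that get_object returns and put_object accepts
# under the same keyword names.
SYSTEM_FIELDS = (
    "ContentType",
    "CacheControl",
    "ContentDisposition",
    "ContentEncoding",
    "ContentLanguage",
    "Expires",
)


def put_kwargs_from_get_response(response):
    """Pick the put_object keyword arguments out of a get_object response."""
    kwargs = {name: response[name] for name in SYSTEM_FIELDS if name in response}
    # User-defined metadata (stored by S3 as x-amz-meta-* headers) goes in
    # Metadata; skip it when the source object has none.
    if response.get("Metadata"):
        kwargs["Metadata"] = response["Metadata"]
    return kwargs


def copy_with_properties(s3_client, src_bucket, dst_bucket, key, transform):
    """Read an object, transform its bytes, and write it back with the same properties.

    transform is a hypothetical callback taking and returning bytes,
    for example a function that resizes the image with PIL.
    """
    response = s3_client.get_object(Bucket=src_bucket, Key=key)
    new_body = transform(response["Body"].read())
    s3_client.put_object(
        Bucket=dst_bucket,
        Key=key,
        Body=new_body,
        **put_kwargs_from_get_response(response)
    )
```

In the Lambda handler this could be called as copy_with_properties(s3_client, bucket, bucket + 'resized', key, resize_bytes), where resize_bytes stands in for whatever function does the PIL work on the raw bytes.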

Archived:	9 years, 2 months ago
Views:	3336
Last activity:	6 years, 1 month ago