An error occurred (EntityTooSmall) when calling the CompleteMultipartUpload operation: Your proposed upload is smaller than the minimum allowed size

ca9*_*3d9 2 python amazon-s3 amazon-web-services

I have the following code, which uploads to S3 using MultipartUpload.

import logging
import boto3


class UploadS3:
    def __init__(self, bucket, prefix):
        self.s3 = boto3.resource('s3')
        self.bucket = bucket
        self.prefix = prefix

    def start(self, key):
        '''Start to upload a new file'''
        self.part_no = 1
        self.parts = []
        key_path = f'{self.prefix}/{key}'
        self.s3obj = self.s3.Object(self.bucket, key_path)
        self.mpu = self.s3obj.initiate_multipart_upload()
        self.buffer = bytearray()

    def upload(self, chunk):
        '''Upload a chunk'''
        if len(self.buffer) >= 5_000_000:
            self._upload_buffer()
        self.buffer += chunk

    def end(self, part_info={}):
        if len(self.buffer):
            self._upload_buffer()
        part_info['Parts'] = self.parts
        mpu_result = self.mpu.complete(MultipartUpload=part_info)
        logging.info(f'Upload result: {mpu_result}')

    def _upload_buffer(self):
        self.part = self.mpu.Part(self.part_no)
        print(f'buffer len: {len(self.buffer)}')
        resp = self.part.upload(Body=self.buffer)
        print({'PartNumber': self.part_no, 'ETag': resp['ETag']})
        self.parts.append({'PartNumber': self.part_no, 'ETag': resp['ETag']})
        self.part_no += 1
        self.buffer = bytearray()

I created the following test code:

upload_s3 = UploadS3(BUCKET, PREFIX)
key = 'key2'
upload_s3.start(key)
upload_s3.upload(b'0' * 1_000_000)
upload_s3.upload(b'1' * 1_000_000)
upload_s3.upload(b'2' * 1_000_000)
upload_s3.upload(b'3' * 1_000_000)
upload_s3.upload(b'4' * 999_999)
upload_s3.upload(b'abcde')
upload_s3.upload(b'12345')
upload_s3.end({})

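However, it fails with the following error. The first part is 5,000,004 bytes long and the second (last) part is 5 bytes. Surely the last part does not need to be larger than 5 MB?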

buffer len: 5000004
{'PartNumber': 1, 'ETag': '"e616f253def9510e3be2af0854e4c992"'}
buffer len: 5
{'PartNumber': 2, 'ETag': '"db44331bface5c8678770426baf73bc2"'}
Traceback (most recent call last):
  File "test1.py", line 35, in <module>
    main()
  File "test1.py", line 31, in main
    upload_s3.end({})
  File "/home/x/upload_s3.py", line 31, in end
    mpu_result = self.mpu.complete(MultipartUpload=part_info)
  File "/apps/external/4/anaconda3/lib/python3.6/site-packages/boto3/resources/factory.py", line 520, in do_action
    response = action(self, *args, **kwargs)
  File "/apps/external/4/anaconda3/lib/python3.6/site-packages/boto3/resources/action.py", line 83, in __call__
    response = getattr(parent.meta.client, operation_name)(*args, **params)
  File "/apps/external/4/anaconda3/lib/python3.6/site-packages/botocore/client.py", line 386, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/apps/external/4/anaconda3/lib/python3.6/site-packages/botocore/client.py", line 705, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (EntityTooSmall) when calling the CompleteMultipartUpload operation: Your proposed upload is smaller than the minimum allowed size

Dee*_*ace 6

As of the time of writing this answer, the S3 multipart upload limits page has the following table:

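| Item | Specification |
| --- | --- |
| Maximum object size | 5 TB |
| Maximum number of parts per upload | 10,000 |
| Part numbers | 1 to 10,000 (inclusive) |
| Part size | 5 MB to 5 GB. There is no minimum size limit on the last part of your multipart upload. |
| Maximum number of parts returned for a list parts request | 1000 |
| Maximum number of multipart uploads returned in a list multipart uploads request | 1000 |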

However, there is a subtle mistake: it says 5 MB instead of 5 MiB (and 5 GB should probably be 5 GiB).

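Since you are splitting the parts every 5 000 000 bytes (which is 5 MB, but "only" about 4.77 MiB), the first part is smaller than the minimum size. The second part is smaller too, but as the last part it is exempt from the minimum.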
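The unit mismatch can be checked with a few lines of arithmetic. This is only an illustrative sketch: the constant names `MB`, `MiB`, `first_part_size` and `min_part_size` are made up for this example, and the 5 MiB minimum is the value this answer argues for, not something read back from S3.

```python
MB = 1000 ** 2    # decimal megabyte: 1,000,000 bytes
MiB = 1024 ** 2   # binary mebibyte: 1,048,576 bytes

first_part_size = 5_000_004   # "buffer len" of part 1 in the question's output
min_part_size = 5 * MiB       # 5,242,880 bytes

# 5 MB expressed in MiB is noticeably less than 5.
print(round(5 * MB / MiB, 2))

# How many bytes the first part falls short of 5 MiB.
print(min_part_size - first_part_size)

# The check that matters for every part except the last one.
print(first_part_size >= min_part_size)
```

The first part is several hundred kilobytes short of 5 MiB, which is consistent with the EntityTooSmall error being raised only when the upload is completed.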

Instead, you should split the parts every 5 242 880 (5 * 1024 ** 2) bytes (or even a bit [no pun intended] more, just to be on the safe side).
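A minimal sketch of the buffering logic with a 5 MiB threshold might look like the following. `PartBuffer`, `add` and `flush` are hypothetical names invented for this example, not part of boto3, and the S3 calls are deliberately left out so the splitting can be run and inspected locally. In the question's class, each returned part would be passed to `self.part.upload(Body=...)`.

```python
MIN_PART_SIZE = 5 * 1024 ** 2  # 5 MiB = 5,242,880 bytes


class PartBuffer:
    """Hypothetical helper: collects chunks and releases parts >= min_part_size."""

    def __init__(self, min_part_size=MIN_PART_SIZE):
        self.min_part_size = min_part_size
        self.buffer = bytearray()

    def add(self, chunk):
        """Append a chunk; return a part once the threshold is reached, else None."""
        self.buffer += chunk
        if len(self.buffer) >= self.min_part_size:
            part, self.buffer = bytes(self.buffer), bytearray()
            return part
        return None

    def flush(self):
        """Return the remaining bytes as the final part, or None if empty."""
        part, self.buffer = bytes(self.buffer), bytearray()
        return part or None


# Replay the chunks from the question's test code.
buf = PartBuffer()
chunks = [b'0' * 1_000_000, b'1' * 1_000_000, b'2' * 1_000_000,
          b'3' * 1_000_000, b'4' * 999_999, b'abcde', b'12345']
parts = [p for p in map(buf.add, chunks) if p is not None]
last = buf.flush()
if last is not None:
    parts.append(last)

print([len(p) for p in parts])
```

With the question's test data the total stays below 5 MiB, so everything ends up in a single part, which is also the last part and therefore not subject to the minimum. Checking the threshold after appending, rather than before as in the question, means a part is released as soon as it is large enough.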

I have submitted a pull request on the S3 docs page.