Getting the number of objects in a specific S3 folder with Boto3

Question


I am trying to get the number of objects in an S3 folder.

Current code:

import boto3

bucket = 'some-bucket'
File = 'someLocation/File/'

# list_objects_v2 returns at most 1000 keys per call
objs = boto3.client('s3').list_objects_v2(Bucket=bucket, Prefix=File)
fileCount = objs['KeyCount']


This gives me a count that is 1 more than the actual number of objects in S3.

Maybe it is also counting "File" as a key?

Answer 1

vag*_*ond 8

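Assuming you want to count the keys in a bucket and don't want to run into the 1000-key limit of list_objects_v2: the code below worked for me, but I wonder whether there is a better, faster way to do this! I tried looking for a wrapper function in the boto3 S3 connector, but there isn't one!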

# connect to s3 - assuming your creds are all set up and you have boto3 installed
import boto3

s3 = boto3.resource('s3')

# identify the bucket - you can use prefix if you know what your bucket name starts with
for bucket in s3.buckets.all():
    print(bucket.name)

# get the bucket
bucket = s3.Bucket('my-s3-bucket')

# use loop and count increment
count_obj = 0
for i in bucket.objects.all():
    count_obj = count_obj + 1
print(count_obj)


A better, more efficient way to count the items of an iterable is to use sum(): count_obj = sum(1 for _ in bucket.objects.all()) (4 upvotes)
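The loop in this answer counts every object in the bucket, while the question asks about a single folder. A minimal sketch of the same idea restricted to a prefix follows, using the resource collection's filter(Prefix=...). The function names count_objects_under_prefix and count_folder are made up for illustration, and boto3 is imported inside the wrapper only so that the counting helper can be tried without AWS access.

```python
def count_objects_under_prefix(bucket, prefix):
    """Count the objects whose key starts with `prefix`.

    `bucket` is expected to be a boto3 Bucket resource; its `objects`
    collection fetches further pages of results as it is iterated.
    """
    return sum(1 for _ in bucket.objects.filter(Prefix=prefix))


def count_folder(bucket_name, prefix):
    """Hypothetical convenience wrapper that builds the Bucket resource."""
    import boto3  # assumes credentials are already configured

    bucket = boto3.resource('s3').Bucket(bucket_name)
    return count_objects_under_prefix(bucket, prefix)
```

For example, count_folder('some-bucket', 'someLocation/File/') would count everything under that prefix. Note that a zero-length "folder" object with the key 'someLocation/File/' is still included in this count if it exists.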

Answer 2

mat*_*rns 5

If there are more than 1000 entries, you need to use a paginator, like this:

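import boto3

count = 0
client = boto3.client('s3')
paginator = client.get_paginator('list_objects')
for result in paginator.paginate(Bucket='your-bucket', Prefix='your-folder/', Delimiter='/'):
    # With a Delimiter, 'CommonPrefixes' lists the immediate sub-"folders";
    # use 'Contents' instead to count the objects directly under the prefix.
    # Either key is missing from a page that has none, so default to [].
    count += len(result.get('CommonPrefixes', []))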

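To make the difference between the two response keys concrete, here is a hedged sketch that tallies both per page. It assumes the list_objects_v2 paginator and the usual response shape, in which 'Contents' and 'CommonPrefixes' are simply absent from a page that has none. The names tally_pages and tally_folder are hypothetical.

```python
def tally_pages(pages):
    """Return (object_count, subfolder_count) for an iterable of listing pages."""
    objects = 0
    subfolders = 0
    for page in pages:
        # Keys directly under the prefix (no further delimiter in the key).
        objects += len(page.get('Contents', []))
        # Keys that were rolled up into a sub-"folder" by the Delimiter.
        subfolders += len(page.get('CommonPrefixes', []))
    return objects, subfolders


def tally_folder(bucket, prefix):
    """Hypothetical wrapper: list one level under `prefix` and tally it."""
    import boto3  # assumes credentials are already configured

    paginator = boto3.client('s3').get_paginator('list_objects_v2')
    pages = paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter='/')
    return tally_pages(pages)
```

Dropping the Delimiter argument makes the listing recursive, in which case every key under the prefix shows up in 'Contents' and there are no 'CommonPrefixes'.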

Answer 3

Joh*_*ein 4

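"Folders" do not actually exist in Amazon S3. Instead, every object has its full path as its name (the "Key"). I think you already know this.

However, a folder can be "created" by creating a zero-length object with the same name as the folder. This makes the folder appear in listings, and it is also what happens when a folder is created through the Management Console.

So, you can exclude zero-length objects from your count.

For an example, see: Determine folder or file key - Boto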
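As a rough sketch of that idea, the helper below walks the pages of a listing and skips folder placeholders. It narrows the rule slightly as an assumption: only zero-length objects whose key ends with "/" are skipped, so that genuinely empty files are still counted. The names count_files and count_files_in_folder are illustrative, not part of boto3.

```python
def count_files(pages):
    """Count objects in list_objects_v2 pages, skipping folder placeholders."""
    total = 0
    for page in pages:
        # 'Contents' is absent from a page that holds no objects.
        for obj in page.get('Contents', []):
            # Assumed placeholder shape: key ends with "/" and size is 0.
            if obj['Key'].endswith('/') and obj['Size'] == 0:
                continue
            total += 1
    return total


def count_files_in_folder(bucket, prefix):
    """Hypothetical wrapper that pages through everything under `prefix`."""
    import boto3  # assumes credentials are already configured

    paginator = boto3.client('s3').get_paginator('list_objects_v2')
    return count_files(paginator.paginate(Bucket=bucket, Prefix=prefix))
```

With the values from the question, count_files_in_folder('some-bucket', 'someLocation/File/') should no longer include the extra entry if that entry is the 'someLocation/File/' placeholder itself.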
