Batch requests with the Google Cloud Storage Python client

con*_*lee 8 python google-cloud-storage

I can't find any examples of how to use the batching feature of the python google-cloud-storage client. I can see that it exists.

I'd like a concrete example. Say I want to delete a bunch of blobs with a given prefix. I start by getting the list of blobs as follows:

from google.cloud import storage

storage_client = storage.Client()
bucket = storage_client.get_bucket('my_bucket_name')
blobs_to_delete = bucket.list_blobs(prefix="my/prefix/here")

# how do I delete the blobs in blobs_to_delete in a single batch?

# bonus: if I have more than 100 blobs to delete, handle the limitation
#        that a batch can only handle 100 operations

Tux*_*ude 12

TL;DR - Just send all the requests within the batch() context manager (available in the google-cloud-python library).

Try this example:

from google.cloud import storage

storage_client = storage.Client()
bucket = storage_client.get_bucket('my_bucket_name')
# Accumulate the iterated results in a list before issuing
# the batch within the context manager
blobs_to_delete = list(bucket.list_blobs(prefix="my/prefix/here"))

# Use the batch context manager to delete all the blobs    
with storage_client.batch():
    for blob in blobs_to_delete:
        blob.delete()

If you were using the REST API directly, you would have to worry about the limit of 100 items per batch yourself. The batch() context manager takes care of this limitation automatically and will issue multiple batch requests if needed.

  • @NKijak It isn't mentioned in the docs, but if you look at the [source](https://googleapis.github.io/google-cloud-python/latest/_modules/google/cloud/storage/batch.html#Batch), you can see the __enter__ and __exit__ methods. Incidentally, in my experience the context manager throws an exception if you exceed the maximum batch size (currently 1000), so you do need to keep track of the number of items per batch. (5 upvotes)
  • Where in the documentation does it mention that it's a context manager? (2 upvotes)
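Per the comment above, the context manager may raise an exception when a batch exceeds the maximum size, so in practice it can be safer to chunk the deletions yourself. Below is a minimal sketch: `chunked` is a hypothetical helper (not part of the library), and the GCS usage shown in comments assumes the `storage_client` and `blobs_to_delete` names from the answer above, untested against a live bucket:

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            break
        yield chunk


# Hypothetical GCS usage (requires credentials and a real bucket):
#
# for chunk in chunked(blobs_to_delete, 100):
#     with storage_client.batch():
#         for blob in chunk:
#             blob.delete()

# The helper itself works on any iterable:
print(list(chunked(range(5), 2)))  # -> [[0, 1], [2, 3], [4]]
```

Picking a chunk size of 100 matches the REST API's documented per-batch limit, which keeps you well under whatever ceiling the client library enforces.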