Download all blobs in an Azure Storage container

pri*_*r35 2 python bash containers azure

I have successfully written a Python script that lists all the blobs in a container.

from azure.storage.blob import BlobService

blob_service = BlobService(account_name='<ACCOUNT_NAME>', account_key='<ACCOUNT_KEY>')

# Page through the container listing; next_marker is empty on the last page
blobs = []
marker = None
while True:
    batch = blob_service.list_blobs('<CONTAINER>', marker=marker)
    blobs.extend(batch)
    if not batch.next_marker:
        break
    marker = batch.next_marker

for blob in blobs:
    print(blob.name)

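As I said, this only lists the blobs that I want to download. I then turned to the Azure CLI to see whether it could help me do what I want. I can download a single blob with the following command: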

azure storage blob download [container]

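It then prompts me for a blob name, which I can get from the Python script. The only way I have to download all of the blobs is to copy and paste each name into the prompt that appears after running the command above. Is there a way I can:

A. Write a bash script that iterates through the list of blobs by running the command and then pasting the next blob name into the prompt.

B. Specify that a whole container should be downloaded, either in the Python script or in the Azure CLI. Is there something I'm not seeing for downloading an entire container?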

Bri*_*SFT 5

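@gary-liu-msft's solution is correct. I made a few more changes to it, and the code can now iterate through the container and the folder structure inside it (PS: there are no folders in a container, only paths), check whether the same directory structure exists on the client, create it if it doesn't, and download the blobs into those paths. It supports long paths with nested subdirectories.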

from azure.storage.blob import BlockBlobService
import os

# Name of your storage account and the access key from Settings->AccessKeys->key1
block_blob_service = BlockBlobService(account_name='storageaccountname', account_key='accountkey')

# Name of the container
container_name = 'testcontainer'
generator = block_blob_service.list_blobs(container_name)

# List all the blobs in the container and download them one after another
for blob in generator:
    print(blob.name)
    if "/" in blob.name:
        # The blob name contains a path: split it into directory part and file name
        head, tail = os.path.split(blob.name)
        local_dir = os.path.join(os.getcwd(), head)
        # Mirror the path under the current working directory, creating it if needed
        if not os.path.isdir(local_dir):
            print("directory doesn't exist, creating it now")
            os.makedirs(local_dir, exist_ok=True)
        block_blob_service.get_blob_to_path(container_name, blob.name, os.path.join(local_dir, tail))
    else:
        # No path component: download straight into the current directory
        block_blob_service.get_blob_to_path(container_name, blob.name, blob.name)

The same code is also available here: https://gist.github.com/brijrajsingh/35cd591c2ca90916b27742d52a3cf6ba
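
Note that `BlockBlobService` belongs to the legacy SDK and is not present in `azure-storage-blob` v12 and later. As a rough sketch only (not the code from the answer or the gist), the same "mirror the blob paths locally" idea could look like the following. The helper names `local_path_for` and `download_container` are made up for illustration, and the client object is only assumed to behave like the v12 `ContainerClient`, i.e. `list_blobs()` yields items with a `.name` and `download_blob(name).readall()` returns bytes.

```python
from pathlib import Path, PurePosixPath


def local_path_for(blob_name, dest_root):
    """Map a blob name such as 'a/b/c.txt' to a path under dest_root.

    Hypothetical helper: blob names use '/' as a separator, so they are
    parsed as POSIX paths regardless of the local operating system.
    """
    parts = PurePosixPath(blob_name).parts
    # Reject empty names, absolute names and '..' segments so that a blob
    # name cannot point outside dest_root.
    if not parts or blob_name.startswith("/") or ".." in parts:
        raise ValueError(f"unsafe blob name: {blob_name!r}")
    return Path(dest_root).joinpath(*parts)


def download_container(container_client, dest_root):
    """Download every blob listed by container_client into dest_root.

    Assumption: container_client behaves like the v12 ContainerClient
    (list_blobs() and download_blob(name).readall()).
    Returns the list of local paths that were written.
    """
    written = []
    for blob in container_client.list_blobs():
        target = local_path_for(blob.name, dest_root)
        # Create the directories implied by the blob's path, if missing
        target.parent.mkdir(parents=True, exist_ok=True)
        # readall() holds the whole blob in memory, which is fine for a sketch
        target.write_bytes(container_client.download_blob(blob.name).readall())
        written.append(target)
    return written


# Hypothetical usage with the v12 SDK (pip install azure-storage-blob):
# from azure.storage.blob import ContainerClient
# client = ContainerClient.from_connection_string("<CONNECTION_STRING>", "testcontainer")
# download_container(client, "downloads")
```

Passing the client in as a parameter keeps the path logic independent of the SDK version. For very large blobs, streaming into an open file with `readinto` would probably be a better fit than `readall`.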