Azure Blob - Read using Python

Ang*_*Sen 12 python azure azure-storage-blobs

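Can someone tell me whether it is possible to read a CSV file directly from Azure Blob Storage as a stream and process it with Python? I know this can be done in C#/.NET (shown below), but I would like to know the equivalent Python library for doing it.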

CloudBlobClient client = storageAccount.CreateCloudBlobClient();
CloudBlobContainer container = client.GetContainerReference("outfiles");
CloudBlob blob = container.GetBlobReference("Test.csv");

Gau*_*tri 10

Yes, this is certainly possible. Take a look at the Azure Storage SDK for Python:

from azure.storage.blob import BlockBlobService

block_blob_service = BlockBlobService(account_name='myaccount', account_key='mykey')

block_blob_service.get_blob_to_path('mycontainer', 'myblockblob', 'out-sunset.png')

You can read the full SDK documentation here: http://azure-storage.readthedocs.io.
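
The snippet above saves the blob to a local file. If the goal is to process the CSV in memory instead, the legacy SDK also provides `get_blob_to_text`, which returns a blob object whose `content` attribute holds the decoded text. A minimal sketch, assuming the legacy `BlockBlobService` client from this answer; the helper name `read_csv_rows` is hypothetical:

```python
import csv
import io


def read_csv_rows(block_blob_service, container_name, blob_name):
    """Return the rows of a CSV blob as lists of strings, without a local file.

    `block_blob_service` is assumed to be a legacy BlockBlobService instance.
    """
    # get_blob_to_text downloads the blob and decodes it (UTF-8 by default)
    blob = block_blob_service.get_blob_to_text(container_name, blob_name)
    # Wrap the text in a file-like object so the csv module can parse it
    return list(csv.reader(io.StringIO(blob.content, newline="")))
```

It could then be called as `read_csv_rows(block_blob_service, 'mycontainer', 'Test.csv')`. Note that this loads the whole blob into memory, so it suits small to medium files.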


Seb*_*zio 7

Here is how to do it with the new version of the SDK (12.0.0):

from azure.storage.blob import BlobClient

blob = BlobClient(account_url="https://<account_name>.blob.core.windows.net",
                  container_name="<container_name>",
                  blob_name="<blob_name>",
                  credential="<account_key>")

with open("example.csv", "wb") as f:
    data = blob.download_blob()
    data.readinto(f)

For more details, see here.

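  • When you run `data = blob.download_blob()`, the blob's contents are available through `data`; you don't need to write them to a file. (3 upvotes)
  • Hi, this still downloads the file. Is it possible to get the blob's contents without downloading the file? (2 upvotes)
  • @SebastianDziadzio Is there a way to read this data into a Python DataFrame? Somehow I can't get it to work with BlockBlobService. (2 upvotes)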
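
As the first comment points out, writing to a file is optional. A minimal sketch of reading the CSV fully in memory, assuming a v12 `BlobClient` like the one in the answer (`download_blob().readall()` returns the contents as bytes); the helper names `parse_csv_bytes` and `read_blob_csv` are hypothetical:

```python
import csv
import io


def parse_csv_bytes(raw, encoding="utf-8"):
    """Parse CSV bytes into a list of dict rows, entirely in memory."""
    text = io.StringIO(raw.decode(encoding), newline="")
    return list(csv.DictReader(text))


def read_blob_csv(blob_client, encoding="utf-8"):
    """Download a CSV blob and parse it without touching the local disk.

    `blob_client` is assumed to be an azure.storage.blob.BlobClient (v12).
    """
    # download_blob() returns a downloader object; readall() yields bytes
    raw = blob_client.download_blob().readall()
    return parse_csv_bytes(raw, encoding)
```

For the DataFrame question, the same bytes can be handed to pandas with `pandas.read_csv(io.BytesIO(raw))`, assuming pandas is installed.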

Dan*_*l R 5

You can stream from a blob with Python like this:

from tempfile import NamedTemporaryFile
from azure.storage.blob.blockblobservice import BlockBlobService

entry_path = conf['entry_path']
container_name = conf['container_name']
blob_service = BlockBlobService(
            account_name=conf['account_name'],
            account_key=conf['account_key'])

def get_file(filename):
    local_file = NamedTemporaryFile()
    blob_service.get_blob_to_stream(container_name, filename, stream=local_file,
                                    max_connections=2)

    local_file.seek(0)
    return local_file

  • Happy to help :) According to the docs (https://docs.python.org/3/library/tempfile.html), it will be closed and destroyed, so there is no need to worry. (3 upvotes)
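
If a temporary file is not wanted, the newer v12 SDK exposes the download as an iterator of byte chunks via `download_blob().chunks()`. A rough sketch of turning such chunks into text lines follows; the helper `iter_lines` is hypothetical and operates on any iterable of bytes, so it is not tied to Azure:

```python
def iter_lines(chunks, encoding="utf-8"):
    """Yield decoded lines from an iterable of byte chunks.

    Only complete lines are decoded, so a multi-byte character that is
    split across two chunks is reassembled before decoding.
    """
    pending = b""
    for chunk in chunks:
        pending += chunk
        # Everything before the last newline is complete; keep the rest
        *complete, pending = pending.split(b"\n")
        for line in complete:
            yield line.rstrip(b"\r").decode(encoding)
    if pending:
        # Final line without a trailing newline
        yield pending.rstrip(b"\r").decode(encoding)
```

With a v12 `BlobClient` named `blob`, this could be used as `for line in iter_lines(blob.download_blob().chunks()): ...`. Splitting on newlines is naive for CSV data with quoted fields that contain line breaks; in that case the lines can be passed to `csv.reader`, which accepts any iterable of strings.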

pis*_*ete 5

I recommend using smart_open:

import os

from azure.storage.blob import BlobServiceClient
from smart_open import open

connect_str = os.environ['AZURE_STORAGE_CONNECTION_STRING']
transport_params = {
    'client': BlobServiceClient.from_connection_string(connect_str),
}

# stream from Azure Blob Storage
with open('azure://my_container/my_file.txt', transport_params=transport_params) as fin:
    for line in fin:
        print(line)

# stream content *into* Azure Blob Storage (write mode):
with open('azure://my_container/my_file.txt', 'wb', transport_params=transport_params) as fout:
    fout.write(b'hello world')