Writing a Python DataFrame to Azure Blob as CSV

Ang*_*Sen 6 python azure azure-storage azure-blob-storage

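I have two questions about reading and writing Python objects from/to Azure Blob storage.

1) Can someone tell me how to write a Python DataFrame as a CSV file directly to Azure Blob storage, without storing it locally?

I tried the functions create_blob_from_text and create_blob_from_stream, but neither of them worked.

Converting the DataFrame to a string and calling create_blob_from_text does write the file to the blob, but as a plain string rather than as CSV.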

    df_b = df.to_string()
    block_blob_service.create_blob_from_text('test', 'OutFilePy.csv', df_b)  

2) How can I read a JSON file in Azure Blob storage directly into Python?

Jay*_*ong 10

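1) Can someone tell me how to write a Python DataFrame as a CSV file directly to Azure Blob storage, without storing it locally?

You can use the pandas.DataFrame.to_csv method. Called without a path, it returns the CSV content as a string, which can be passed to create_blob_from_text.

Sample code: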

from azure.storage.blob import BlockBlobService
import pandas as pd

# Build a small example DataFrame.
head = ["col1", "col2", "col3"]
rows = [[1, 2, 3], [4, 5, 6], [8, 7, 9]]
df = pd.DataFrame(rows, columns=head)
print(df)

# With no path argument, to_csv returns the CSV content as a string.
output = df.to_csv(index_label="idx", encoding="utf-8")
print(output)

accountName = "***"
accountKey = "***"
containerName = "test1"
blobName = "OutFilePy.csv"

blobService = BlockBlobService(account_name=accountName, account_key=accountKey)

# Upload the CSV text directly; nothing is written to local disk.
blobService.create_blob_from_text(containerName, blobName, output)

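The difference between the question's to_string call and to_csv can be checked locally, without an Azure account. The following is a minimal sketch, assuming only pandas is installed; the sample data and variable names are illustrative and not taken from the original post.

```python
import io

import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["col1", "col2", "col3"])

# to_string produces a fixed-width table meant for human reading.
as_table = df.to_string()

# to_csv with no path returns the CSV text instead of writing a file.
as_csv = df.to_csv(index=False)

# Encode to bytes if the upload call expects binary data.
csv_bytes = as_csv.encode("utf-8")

# Reading the text back shows that it is valid CSV.
round_trip = pd.read_csv(io.StringIO(as_csv))

print(as_table)
print(as_csv)
```

Either as_csv (text) or csv_bytes (bytes) is what should be handed to the upload call, rather than the output of to_string.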

2) How can I read a JSON file in Azure Blob storage directly into Python?

Sample code:

from azure.storage.blob import (
    BlockBlobService
)

accountName = "***"
accountKey = "***"
containerName = "test1"
blobName = "test3.json"

blobService = BlockBlobService(account_name=accountName, account_key=accountKey)

result = blobService.get_blob_to_text(containerName,blobName)

print(result.content)

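The snippet above yields the blob's content as text. To work with it as a Python object, the text can be parsed with the standard json module. This is a minimal sketch: blob_text stands in for result.content, and its value is made up for illustration.

```python
import json

# Stand-in for result.content; the real text would come from get_blob_to_text.
blob_text = '{"name": "example", "values": [1, 2, 3]}'

# json.loads turns the JSON text into plain Python dicts and lists.
data = json.loads(blob_text)

print(data["name"])
print(data["values"])
```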

Hope it helps.


小智 10

The accepted answer did not work for me because it depends on the azure-storage package, which has been deprecated/legacy since 2021. I changed it as follows:

import os

import dotenv
import pandas as pd
from azure.storage.blob import ContainerClient

# Load CONNECTION_STRING and CONTAINER_NAME from a .env file.
dotenv.load_dotenv()
blob_block = ContainerClient.from_connection_string(
    conn_str=os.environ["CONNECTION_STRING"],
    container_name=os.environ["CONTAINER_NAME"]
    )
partial = pd.DataFrame()  # placeholder; substitute your own DataFrame
output = partial.to_csv(encoding='utf-8')
name = "OutFilePy.csv"  # blob name; not defined in the original snippet, chosen for illustration
blob_block.upload_blob(name, output, overwrite=True, encoding='utf-8')
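The second part of the question (reading JSON) can be handled with the newer SDK in a similar way. The sketch below assumes the client passed in behaves like the ContainerClient used above, whose download_blob call returns a downloader with a readall method. The helper read_json_blob and the two Fake classes are hypothetical names introduced here so the example runs without a storage account.

```python
import json


def read_json_blob(container_client, blob_name):
    """Download a blob and parse its content as JSON.

    container_client is assumed to behave like azure.storage.blob.ContainerClient:
    download_blob(name) returns an object whose readall() yields bytes.
    """
    raw = container_client.download_blob(blob_name).readall()
    return json.loads(raw.decode("utf-8"))


# In-memory stand-ins (hypothetical) for trying the helper offline.
class FakeDownloader:
    def __init__(self, payload):
        self._payload = payload

    def readall(self):
        return self._payload


class FakeContainerClient:
    def __init__(self, blobs):
        self._blobs = blobs

    def download_blob(self, blob_name):
        return FakeDownloader(self._blobs[blob_name])


fake = FakeContainerClient({"test3.json": b'{"col1": [1, 4, 8]}'})
parsed = read_json_blob(fake, "test3.json")
print(parsed)
```

With a real storage account, the object returned by ContainerClient.from_connection_string would be passed in place of the fake client.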