Downloading from a storage account using an Azure Python function and a managed identity

Chr*_*mus 4 azure-storage azure-data-lake azure-functions azure-managed-identity azure-rbac

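I created an Azure Function written in Python, named "transformerfunction", that should upload data to and download data from Azure Data Lake / Storage. I also turned on the system-assigned managed identity and granted the function the "Storage Blob Data Contributor" role on my storage account.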

To authenticate and download a file, I essentially follow the documentation and use this piece of code:

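from azure.identity import ChainedTokenCredential, ManagedIdentityCredential
from azure.storage.filedatalake import DataLakeServiceClient

# Authenticate as the function's system-assigned managed identity
managed_identity = ManagedIdentityCredential()
credential_chain = ChainedTokenCredential(managed_identity)
client = DataLakeServiceClient(account_url, credential=credential_chain)

# Download the file into f, an already opened writable binary stream
file_client = client.get_file_client(file_system_container, file_name)
downloaded_file = file_client.download_file()
downloaded_file.readinto(f)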

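If my understanding is correct, Azure should authenticate using the function's identity, and because that identity has the Storage Blob Data Contributor role on the storage account, the download should work.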
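One way to check this assumption is to request a storage token with the managed identity and look at its claims. The following is a minimal sketch and not part of the original function: `decode_jwt_claims` and `storage_token_claims` are hypothetical helper names, and the sketch assumes the access token is a JWT whose payload is read without verifying the signature (for inspection only). The `oid` claim should match the object ID of the function's managed identity.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the unverified payload claims of a JWT, for inspection only."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated parts")
    payload = parts[1]
    # base64url payloads omit padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def storage_token_claims() -> dict:
    """Acquire an Azure Storage token with the managed identity and decode it."""
    # Imported here so the decoding helper also works without the Azure SDK.
    from azure.identity import ManagedIdentityCredential

    credential = ManagedIdentityCredential()
    # Scope for Azure Storage data-plane access.
    access_token = credential.get_token("https://storage.azure.com/.default")
    return decode_jwt_claims(access_token.token)
```

Calling `storage_token_claims()` inside the function and logging the `oid` claim would show which identity the request is made as. If a token is issued for the expected identity, authentication itself is working and the failure lies elsewhere.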

However, when I invoke the function and look at the logs, this is what I see:

2020-11-23 20:04:11.396 Function called
2020-11-23 20:04:11.397 ManagedIdentityCredential will use App Service managed identity
2020-11-23 20:04:13.105
Result: Failure Exception: HttpResponseError: This request is not authorized to perform this operation. 
RequestId:1f6a2a1c-b01e-0090-26d3-c1d0c0000000 Time:2020-11-23T20:04:13.0679405Z ErrorCode:AuthorizationFailure Error:None Stack:
File "/azure-functions-host/workers/python/3.6/LINUX/X64/azure_functions_worker/dispatcher.py", line 357, in _handle__invocation_request self.__run_sync_func, invocation_id, fi.func, args)
File "/usr/local/lib/python3.6/concurrent/futures/thread.py", line 56, in run result = self.fn(*self.args, **self.kwargs)
File "/azure-functions-host/workers/python/3.6/LINUX/X64/azure_functions_worker/dispatcher.py", line 542, in __run_sync_func return func(**params)
File "/home/site/wwwroot/shared/datalake.py", line 65, in download downloaded_file = client.download_file()
File "/home/site/wwwroot/.python_packages/lib/python3.6/site-packages/azure/storage/filedatalake/_data_lake_file_client.py", line 593, in download_file downloader = self._blob_client.download_blob(offset=offset, length=length, **kwargs)
File "/home/site/wwwroot/.python_packages/lib/python3.6/site-packages/azure/core/tracing/decorator.py", line 83, in wrapper_use_tracer return func(*args, **kwargs)
File "/home/site/wwwroot/.python_packages/lib/python3.6/site-packages/azure/storage/blob/_blob_client.py", line 674, in download_blob return StorageStreamDownloader(**options)
File "/home/site/wwwroot/.python_packages/lib/python3.6/site-packages/azure/storage/blob/_download.py", line 316, in __init__ self._response = self._initial_request()
File "/home/site/wwwroot/.python_packages/lib/python3.6/site-packages/azure/storage/blob/_download.py", line 403, in _initial_request process_storage_error(error)
File "/home/site/wwwroot/.python_packages/lib/python3.6/site-packages/azure/storage/blob/_shared/response_handlers.py", line 147, in process_storage_error raise error

This clearly shows that the function is not authorized to download the blob. But why? What do I have to do differently?

Edit:

I found the cause of the problem: I had restricted network access to my Data Lake storage account in its networking settings.

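My assumption was that "Allow trusted Microsoft services to access this storage account" would always let a function running on Azure access the storage account, regardless of whether access is limited to selected networks or which networks are selected. That is not the case.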
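The error code in the log above already hints at this distinction. Azure Storage commonly returns `AuthorizationFailure` when a request is rejected outright, for example by the account's network rules, and `AuthorizationPermissionMismatch` when the caller is authenticated but lacks a suitable role assignment. The sketch below pulls the `ErrorCode` value out of an error message and maps it to a hint. `extract_error_code`, `authorization_hint`, and the hint texts are illustrative and not part of any Azure SDK, and the listed causes are common ones rather than an exhaustive list.

```python
import re
from typing import Optional

# Illustrative hints for two storage error codes (common causes only).
HINTS = {
    "AuthorizationFailure": (
        "Request rejected: check the storage account's firewall "
        "and virtual network rules."
    ),
    "AuthorizationPermissionMismatch": (
        "Identity lacks permission: check its role assignments "
        "on the storage account."
    ),
}


def extract_error_code(message: str) -> Optional[str]:
    """Return the value following 'ErrorCode:' in a storage error message."""
    match = re.search(r"ErrorCode:\s*(\w+)", message)
    return match.group(1) if match else None


def authorization_hint(message: str) -> str:
    """Map the error code found in the message to a troubleshooting hint."""
    code = extract_error_code(message)
    if code is None:
        return "No ErrorCode found in the message."
    return HINTS.get(code, f"Unrecognized error code: {code}")
```

Applied to the log line above, `extract_error_code` yields `AuthorizationFailure`, which points at the network restriction rather than at the role assignment.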

Sta*_*ong 5

I'm not sure what the cause is on your side, but the code below works fine for me:

import azure.functions as func
from azure.identity import ChainedTokenCredential, ManagedIdentityCredential
from azure.storage.filedatalake import DataLakeServiceClient


def main(req: func.HttpRequest) -> func.HttpResponse:
    # Authenticate as the function's managed identity (MSI)
    msi_credential = ManagedIdentityCredential()
    credential_chain = ChainedTokenCredential(msi_credential)

    client = DataLakeServiceClient(
        "https://<Azure Data Lake Gen2 account name>.dfs.core.windows.net",
        credential=credential_chain,
    )

    # Download the file and return its content as the HTTP response body
    file_client = client.get_file_client("container name", "filename.txt")
    stream = file_client.download_file()

    return func.HttpResponse(stream.readall())

My function's MSI configuration: [screenshots]

The content of my test file: [screenshot]

Test result: [screenshot]