How can I use an Azure DevOps Artifacts feed as a package source for an AzureML DatabricksStep?

Ani*_*aha 6 azure azure-devops azure-artifacts azure-databricks azure-machine-learning-service

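If we add PyPI packages as artifacts to an Azure DevOps project feed, how can we use that feed as the source for installing packages in a DatabricksStep in Azure Machine Learning service?

When using pip in any environment, we consume the Azure DevOps project Artifacts feed like this: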

pip install example-package --index-url=https://<Personal-Access-Token>@pkgs.dev.azure.com/<Organization-Name>/_packaging/<Artifacts-Feed-Name>/pypi/simple/
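The same index URL can be assembled in Python before it is handed to any installer. The sketch below is only an illustration: the function name `build_feed_index_url` and its parameters are hypothetical, and the percent-encoding step assumes the consumer decodes the credential part of the URL the way pip does.

```python
from urllib.parse import quote


def build_feed_index_url(organization: str, feed: str, token: str) -> str:
    """Return a pip-style index URL for an Azure Artifacts PyPI feed.

    Hypothetical helper: it mirrors the URL pattern from the pip command
    above and is not part of any Azure SDK.
    """
    # Percent-encode every piece so that characters such as '/' or '@'
    # cannot change the structure of the URL.
    safe_token = quote(token, safe="")
    safe_org = quote(organization, safe="")
    safe_feed = quote(feed, safe="")
    return (
        f"https://{safe_token}@pkgs.dev.azure.com/"
        f"{safe_org}/_packaging/{safe_feed}/pypi/simple/"
    )
```

The returned string can be passed to `pip install --index-url=...`. Keeping the token as a separate argument avoids hard-coding it in the script.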

The DatabricksStep class of Azure Machine Learning service accepts the following parameters:

python_script_name = "<Some-Script>.py"
source_directory = "<Path-To-Script>"

<Some-Placeholder-Name-for-the-step> = DatabricksStep(
    name=<Some-Placeholder-Name-for-the-step>,
    num_workers=1,
    python_script_name=python_script_name,
    source_directory=source_directory,
    run_name= <Name-of-the-run>,
    compute_target=databricks_compute,
    pypi_libraries = [
                      PyPiLibrary(package = 'scikit-learn'), 
                      PyPiLibrary(package = 'scipy'), 
                      PyPiLibrary(package = 'azureml-sdk'), 
                      PyPiLibrary(package = 'joblib'), 
                      PyPiLibrary(package = 'azureml-dataprep[pandas]'),
                      PyPiLibrary(package = 'example-package', repo='https://<Personal-Access-Token>@pkgs.dev.azure.com/<Organization-Name>/_packaging/<Artifacts-Feed-Name>/pypi/simple/')
                    ], 

    allow_reuse=True
)

However, PyPiLibrary(package = 'example-package', repo='https://<Personal-Access-Token>@pkgs.dev.azure.com/<Organization-Name>/_packaging/<Artifacts-Feed-Name>/pypi/simple/') raises an error. How exactly should we pass the Artifacts feed as input to the PyPiLibrary entries of the DatabricksStep class in Azure Machine Learning service?

Sys*_*nin 0

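Try the snippet below, which uses a wrapper function for the private package repository string. At least it works for me. If it does not work in your case, please post the exact error you are seeing.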

from azureml.core.databricks import PyPiLibrary  # needed for the PyPiLibrary entries below
from azureml.pipeline.steps import DatabricksStep

def get_pypi_repo_url():
    token = "<your-secured-token-here from a keyvault>" # probably workspace object required
    return f"https://git:{token}@pkgs.dev.azure.com/bla-bla-bla/pypi/simple/"


python_script_name = "<Some-Script>.py"
source_directory = "<Path-To-Script>"

<Some-Placeholder-Name-for-the-step> = DatabricksStep(
    name=<Some-Placeholder-Name-for-the-step>,
    num_workers=1,
    python_script_name=python_script_name,
    source_directory=source_directory,
    run_name= <Name-of-the-run>,
    compute_target=databricks_compute,
    pypi_libraries=[
        PyPiLibrary("azureml.core"),
        PyPiLibrary(
            "example-package", repo=get_pypi_repo_url(),
        ),
    ], 

    allow_reuse=True
)

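The wrapper in the answer leaves open how the token is fetched. One possible shape, offered as a sketch rather than the answerer's actual code, is to inject the secret lookup as a callable so that the function works with any secret store. The parameter names, the secret name `devops-feed-pat`, and the URL layout (copied from the question, not from the elided path in the answer) are all assumptions.

```python
from typing import Callable
from urllib.parse import quote


def get_pypi_repo_url(
    get_secret: Callable[[str], str],
    organization: str,
    feed: str,
    secret_name: str = "devops-feed-pat",  # hypothetical secret name
) -> str:
    """Build the private index URL, fetching the token through get_secret."""
    token = get_secret(secret_name)
    # Percent-encode the token so special characters cannot break the URL.
    safe_token = quote(token, safe="")
    # "git" as the user name follows the answer's snippet.
    return (
        f"https://git:{safe_token}@pkgs.dev.azure.com/"
        f"{organization}/_packaging/{feed}/pypi/simple/"
    )


# Possible wiring with an AzureML workspace (assumes azureml-core is
# installed and the secret was stored in the workspace Key Vault beforehand):
#
#   from azureml.core import Workspace
#   keyvault = Workspace.from_config().get_default_keyvault()
#   repo = get_pypi_repo_url(
#       lambda name: keyvault.get_secret(name=name),
#       organization="<Organization-Name>",
#       feed="<Artifacts-Feed-Name>",
#   )
```

The result can then be passed as `repo=` to `PyPiLibrary`. Because the lookup is a plain callable, the function can be exercised with a dictionary in place of a real Key Vault.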