MLflow artifacts stored on FTP server but not showing in UI

Wil*_*ner 6 python ftp machine-learning pytorch mlflow

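I'm using MLflow to store some parameters and metrics during training on a remote tracking server. Now I'm also trying to add a .png file as an artifact, but since the MLflow server runs remotely, I store the file on an FTP server. I gave MLflow the FTP server address and path by starting the server like this: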

mlflow server --backend-store-uri sqlite:///mlflow.sqlite --default-artifact-root ftp://user:password@1.2.3.4/artifacts/ --host 0.0.0.0 &
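To make the pieces of the `--default-artifact-root` URI explicit, here is a minimal sketch that splits it with the standard library and builds the directory a run's artifacts would be expected to land in. The helper names `parse_artifact_root` and `expected_run_dir` are hypothetical, the experiment ID `1` and run ID `abc123` are made-up values, and the `<root>/<experiment_id>/<run_id>/artifacts` layout is an assumption based on MLflow's default behavior; check the run's actual artifact URI before relying on it.

```python
import posixpath
from urllib.parse import urlparse


def parse_artifact_root(uri):
    """Split an ftp:// artifact root into host, port, credentials and path."""
    parsed = urlparse(uri)
    if parsed.scheme != "ftp":
        raise ValueError(f"expected an ftp:// URI, got {parsed.scheme!r}")
    return {
        "host": parsed.hostname,
        "port": parsed.port or 21,  # 21 is the standard FTP control port
        "user": parsed.username,
        "password": parsed.password,
        "path": parsed.path,
    }


def expected_run_dir(root_path, experiment_id, run_id):
    # Assumed default layout: <root>/<experiment_id>/<run_id>/artifacts
    return posixpath.join(root_path, str(experiment_id), run_id, "artifacts")


root = parse_artifact_root("ftp://user:password@1.2.3.4/artifacts/")
print(root["host"], root["port"], root["path"])
print(expected_run_dir(root["path"], 1, "abc123"))
```

Note that the credentials are embedded in the URI, so both the training client and the tracking server end up using the same string to reach the FTP server.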

Now I train a network and store the artifact by running:

mlflow.set_tracking_uri(remote_server_uri)
mlflow.set_experiment("default")
mlflow.pytorch.autolog()

with mlflow.start_run():
    mlflow.log_params(flow_params)
    trainer.fit(model)
    trainer.test()
    mlflow.log_artifact("confusion_matrix.png")
mlflow.end_run()

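I save the .png file locally and then log it with mlflow.log_artifact("confusion_matrix.png"), which puts it in the right folder for the experiment on the FTP server. Everything works so far, except that the artifact does not show up in the online MLflow UI. The logged parameters and metrics show up fine. The artifact panel stays empty and only shows: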

No Artifacts Recorded
Use the log artifact APIs to store file outputs from MLflow runs.
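One way to confirm that the file really sits where the run expects it is to walk the run's artifact directory on the FTP server. The following is a minimal sketch using the standard-library `ftplib`. The helper names `list_ftp_files` and `check_run_artifacts` are hypothetical, and it assumes the FTP server supports the MLSD command, which not every server does.

```python
import posixpath
from ftplib import FTP


def list_ftp_files(ftp, path):
    """Recursively collect the full paths of all files below `path`."""
    files = []
    for name, facts in ftp.mlsd(path):
        kind = facts.get("type")
        full_path = posixpath.join(path, name)
        if kind == "dir":
            # Descend into subdirectories (e.g. a logged model folder)
            files.extend(list_ftp_files(ftp, full_path))
        elif kind == "file":
            files.append(full_path)
        # "cdir" and "pdir" entries (current/parent directory) are skipped
    return files


def check_run_artifacts(host, user, password, run_dir):
    """Log in to the FTP server and list every file under the run directory."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        return list_ftp_files(ftp, run_dir)
```

Calling `check_run_artifacts("1.2.3.4", "user", "password", run_dir)` with the run's artifact directory should return a list that includes `confusion_matrix.png` if the upload worked. Running it from the machine that hosts the tracking server would also show whether that machine can reach the FTP server at all.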

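I found similar threads, but only from users who had the same problem with local MLflow storage. Unfortunately, I could not apply those fixes to my problem. Does anyone know how to solve this?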

小智 -3

from ftplib import FTP, all_errors

import mlflow

# Placeholders: replace with your own FTP details
ftp_server = "your_server"
ftp_username = "your_username"
ftp_password = "your_password"
ftp_artifact_path = "path/to/artifact"


def verify_ftp_connection():
    """Check that we can connect and log in to the FTP server."""
    try:
        with FTP(ftp_server) as ftp:
            ftp.login(ftp_username, ftp_password)
            print("Verifying FTP server connection... Success")
    except all_errors as e:
        print(f"Verifying FTP server connection... Failed - {e}")


def check_artifact_path():
    """Check that the artifact directory exists and can be entered."""
    try:
        with FTP(ftp_server) as ftp:
            ftp.login(ftp_username, ftp_password)
            ftp.cwd(ftp_artifact_path)
            print("Checking artifact path... Success")
    except all_errors as e:
        print(f"Checking artifact path... Failed - {e}")


def check_mlflow_version():
    """Print the installed MLflow version."""
    print(f"Checking MLflow version... Version: {mlflow.__version__}")


if __name__ == "__main__":
    # Connection first, then the path, then the library version
    verify_ftp_connection()
    check_artifact_path()
    check_mlflow_version()

    print("Script execution complete.")

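This Python script checks that the basics are in place for using MLflow with an FTP server. First, it checks that it can connect and log in to the FTP server with the credentials provided. Then it checks that the specified artifact path on the server is reachable. Finally, it prints the MLflow library version so you can check compatibility.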

After each check, it reports success or the error it ran into. In short, the script confirms that the setup needed to log MLflow artifacts to the FTP server is in place.

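You can also run a PowerShell script to inspect the run and its artifacts through the tracking server and troubleshoot any errors you encounter: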

# Replace these with your actual values
$mlflowServerUri = "http://your-mlflow-server-uri"
$runId = "your-run-id"

try {
    # Fetch the run's metadata (GET /api/2.0/mlflow/runs/get)
    $response = Invoke-RestMethod -Uri "$mlflowServerUri/api/2.0/mlflow/runs/get?run_id=$runId" -Method Get
    $info = $response.run.info
    Write-Host "Run ID: $($info.run_id)"
    Write-Host "Status: $($info.status)"
    Write-Host "Started at: $($info.start_time)"
    Write-Host "Ended at: $($info.end_time)"
    Write-Host "Artifact URI: $($info.artifact_uri)"

    # List the artifacts the tracking server can see for this run
    $artifacts = Invoke-RestMethod -Uri "$mlflowServerUri/api/2.0/mlflow/artifacts/list?run_id=$runId" -Method Get
    Write-Host "`nArtifact root: $($artifacts.root_uri)"
    foreach ($file in $artifacts.files) {
        Write-Host "Path: $($file.path)  Is directory: $($file.is_dir)  Size: $($file.file_size)"
    }
}
catch {
    # A wrong server URI or an unknown run ID ends up here
    Write-Host "Request failed for run ID ${runId}: $($_.Exception.Message). Check the server URI and run ID."
}
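The same artifact listing can be requested from Python, which matches the language of the training code. This is a rough sketch using only the standard library. The helper names `artifacts_list_url`, `artifact_paths` and `fetch_artifacts` are hypothetical, and it assumes the tracking server exposes the `artifacts/list` REST endpoint without authentication and returns a JSON object whose `files` entries carry a `path` field.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def artifacts_list_url(server_uri, run_id, path=None):
    """Build the artifacts/list URL for a run (optionally for a subdirectory)."""
    query = {"run_id": run_id}
    if path:
        query["path"] = path
    base = server_uri.rstrip("/")
    return f"{base}/api/2.0/mlflow/artifacts/list?{urlencode(query)}"


def artifact_paths(response):
    """Extract artifact paths from a decoded artifacts/list response."""
    # The "files" key is assumed to be absent when the server sees no artifacts
    return [entry["path"] for entry in response.get("files", [])]


def fetch_artifacts(server_uri, run_id):
    """Ask the tracking server which artifacts it can list for the run."""
    with urlopen(artifacts_list_url(server_uri, run_id), timeout=10) as resp:
        return json.load(resp)
```

For example, `artifact_paths(fetch_artifacts(remote_server_uri, run_id))` returning an empty list while the file exists on the FTP server would suggest the tracking server itself cannot list the FTP directory. That would be consistent with the empty artifact panel, though it does not by itself identify the cause.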