Get the run ID for an MLflow experiment by its name?

Question

I have created an experiment in MLflow and logged multiple runs in it.

from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
import mlflow

experiment_name = "experiment-1"
mlflow.set_experiment(experiment_name)

# x_train, y_train, x_cv and y_cv are assumed to be defined earlier
no_of_trees = [100, 200, 300]
depths = [2, 3, 4]
for trees in no_of_trees:
    for depth in depths:
        with mlflow.start_run() as run:
            # 'squared_error' replaces criterion='mse', which scikit-learn 1.2 removed
            model = RandomForestRegressor(n_estimators=trees, criterion='squared_error', max_depth=depth)
            model.fit(x_train, y_train)
            predictions = model.predict(x_cv)
            # take the square root so the logged value really is the RMSE
            mlflow.log_metric('rmse', mean_squared_error(y_cv, predictions) ** 0.5)

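After creating the runs, I want to get the best run_id for this experiment. For now I can find the best run by looking at the MLflow UI, but how can I do this programmatically?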

Answer 1

Rav*_*i G 5

We can get the experiment ID from the experiment name, and then use the Python API to find the best run.

experiment_name = "experiment-1"
# get_experiment_by_name returns None if no experiment has this name
current_experiment = mlflow.get_experiment_by_name(experiment_name)
experiment_id = current_experiment.experiment_id


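Using the experiment ID, we can fetch all runs and sort them by a metric, as shown below. In the code below, rmse is my metric name (so yours may differ depending on the metric name you use).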

# RMSE is an error metric, so sort ascending to put the lowest (best) value first
df = mlflow.search_runs([experiment_id], order_by=["metrics.rmse ASC"])
best_run_id = df.loc[0, 'run_id']

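Since `mlflow.search_runs` returns a pandas DataFrame by default, with one `metrics.<name>` column per logged metric, the best run can also be picked on the DataFrame itself. The sketch below assumes that lower is better for an error metric such as rmse; the helper `best_run_id_from` and the sample values are made up for illustration, and the hand-built frame only stands in for a real `search_runs` result.

```python
import pandas as pd


def best_run_id_from(runs: pd.DataFrame, metric: str = "rmse",
                     lower_is_better: bool = True) -> str:
    """Pick the run_id with the best `metric` from a search_runs-style frame."""
    column = f"metrics.{metric}"
    if column not in runs.columns:
        raise KeyError(f"no run logged the metric {metric!r}")
    # Runs that never logged the metric show up as NaN; leave them out
    scored = runs.dropna(subset=[column])
    if scored.empty:
        raise ValueError(f"every run is missing the metric {metric!r}")
    if lower_is_better:
        best_index = scored[column].idxmin()
    else:
        best_index = scored[column].idxmax()
    return scored.loc[best_index, "run_id"]


# Stand-in for mlflow.search_runs([experiment_id]); IDs and values are invented
runs = pd.DataFrame({
    "run_id": ["a1", "b2", "c3"],
    "metrics.rmse": [0.42, 0.31, float("nan")],
})
print(best_run_id_from(runs))
```

Passing `lower_is_better=False` covers metrics such as accuracy or R², where the largest value is the best one.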

In Azure Databricks, you need to provide the full path of the experiment in place of the experiment name. (2 upvotes)

Answer 2

Mac*_*ski 5

To search directly by name or other attributes, use filter_string:

mlflow.search_runs(filter_string="run_name='CV_M1_A1_regional'")['run_id'] #539aa3507ba54ebf86e64c7c9766fcee

