I'm trying to use XGBoost on a dataset with roughly 500,000 observations and 10 features. I'm doing some hyperparameter tuning with RandomizedSearchCV, and the model with the best found parameters performs worse than the model with the default parameters.
Model with default parameters:
from sklearn.metrics import r2_score

model = XGBRegressor()
model.fit(X_train, y_train["speed"])
y_predict_speed = model.predict(X_test)
print("R2 score:", r2_score(y_test["speed"], y_predict_speed, multioutput='variance_weighted'))
R2 score: 0.3540656307310167
Best model from the random search:
## Hyper Parameter Optimization
n_estimators = [100, 500, 900, 1100, 1500]
max_depth = [2, 3, 5, 10, 15]
booster = ['gbtree', 'gblinear']
learning_rate = [0.05, 0.1, 0.15, 0.20]
min_child_weight = [1, 2, 3, 4]
base_score = [0.25, 0.5, 0.75, 1]

# Define the grid of hyperparameters to search
hyperparameter_grid = {
    'n_estimators': n_estimators,
    'max_depth': max_depth,
    'learning_rate': learning_rate,
    'min_child_weight': min_child_weight,
    'booster': booster,
    'base_score': base_score
}
# Set up the random search with 4-fold cross-validation
…