I need to pass a sample_weight parameter to my RandomForestClassifier, like this:
X = np.array([[2.0, 2.0, 1.0, 0.0, 1.0, 3.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 5.0, 3.0,
2.0, '0'],
[15.0, 2.0, 5.0, 5.0, 0.466666666667, 4.0, 3.0, 2.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0,
7.0, 14.0, 2.0, '0'],
[3.0, 4.0, 3.0, 1.0, 1.33333333333, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0,
0.0, 1.0, 0.0, 0.0, 0.0, 0.0, …

I want to use the early-stopping option in scikit-learn's GridSearchCV method. An example of this is shown in this SO thread:
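import xgboost as xgb
from sklearn.model_selection import GridSearchCV

# Toy data; the "test" set is simply a copy of the training set.
trainX = [[1], [2], [3], [4], [5]]
trainY = [1, 2, 3, 4, 5]
testX = trainX
testY = trainY

param_grid = {"subsample": [0.5, 0.8],
              "n_estimators": [600]}

# Extra keyword arguments forwarded to XGBRegressor.fit() on every fit.
fit_params = {"early_stopping_rounds": 1,
              "eval_set": [[testX, testY]]}

model = xgb.XGBRegressor()
# Note: the fit_params constructor argument is only accepted by older
# scikit-learn releases; newer ones expect these as keyword arguments to fit().
gridsearch = GridSearchCV(estimator=model,
                          param_grid=param_grid,
                          fit_params=fit_params,
                          verbose=1,
                          cv=2)
gridsearch.fit(trainX, trainY)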
However, I would like to use the hold-out set of the cross-validation process as the validation set. Is there a way to specify this in GridSearchCV?