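I am currently building a classifier on heavily imbalanced data. I use an imblearn pipeline that applies StandardScaler first, then SMOTE, and then the classifier, and I tune it with GridSearchCV. This ensures that the oversampling happens inside cross-validation. Now I want to include feature selection in my pipeline. How should I add this step to the pipeline?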
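from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix, make_scorer, recall_score
from sklearn.model_selection import GridSearchCV

# Oversample with SMOTE, then classify; imblearn's Pipeline resamples only during fit.
model = Pipeline([
    ('sampling', SMOTE()),
    ('classification', RandomForestClassifier())
])
# Step-name prefixes route each parameter to the matching pipeline step.
param_grid = {
    'classification__n_estimators': [10, 20, 50],
    'classification__max_depth': [2, 3, 5]
}
gridsearch_model = GridSearchCV(model, param_grid, cv=4, scoring=make_scorer(recall_score))
gridsearch_model.fit(X_train, y_train)
predictions = gridsearch_model.predict(X_test)
print(classification_report(y_test, predictions))
print(confusion_matrix(y_test, predictions))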