sklearn grid search with a grouped K-fold CV generator

use*_*er0 4 scikit-learn cross-validation

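I am trying to run a parameter search in sklearn using randomized search together with a grouped K-fold cross-validation generator. The following works: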

skf=StratifiedKFold(n_splits=5,shuffle=True,random_state=0)
rs=sklearn.model_selection.RandomizedSearchCV(clf,parameters,scoring='roc_auc',cv=skf,n_iter=10)
rs.fit(X,y)

This does not:

gkf=GroupKFold(n_splits=5)
rs=sklearn.model_selection.RandomizedSearchCV(clf,parameters,scoring='roc_auc',cv=gkf,n_iter=10)
rs.fit(X,y)

#ValueError: The groups parameter should not be None

How do I specify the groups parameter?

This does not work either:

gkf=GroupKFold(n_splits=5)
fv = gkf.split(X, y, groups=groups)
rs=sklearn.model_selection.RandomizedSearchCV(clf,parameters,scoring='roc_auc',cv=gkf,n_iter=10)
rs.fit(X,y)

#TypeError: object of type 'generator' has no len()

use*_*er0 5

For reference, this is done by passing groups to fit:

rs.fit(X,y,groups=groups)

for the search object defined as:

rs=sklearn.model_selection.RandomizedSearchCV(forest,parameters,scoring='roc_auc',cv=gkf,n_iter=10)
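To make the fix concrete, here is a minimal self-contained sketch of the same idea. The synthetic data, the group layout (20 groups of 10 samples), the parameter grid, and the small n_iter are assumptions made purely for illustration and are not from the original post. It also shows a second route that should work as well: materialising the splits into a list before handing them to cv, rather than passing the raw generator.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, RandomizedSearchCV

# Illustrative data (assumption): 200 samples in 20 groups of 10 samples each.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
groups = np.repeat(np.arange(20), 10)

forest = RandomForestClassifier(random_state=0)
# Hypothetical search space, kept small so the example runs quickly.
parameters = {"n_estimators": [10, 20, 50], "max_depth": [2, 4, None]}

gkf = GroupKFold(n_splits=5)

# Option 1: pass the splitter as cv and the group labels to fit().
rs = RandomizedSearchCV(forest, parameters, scoring="roc_auc",
                        cv=gkf, n_iter=4, random_state=0)
rs.fit(X, y, groups=groups)

# Option 2: precompute the (train, test) index pairs as a list and pass
# that list as cv; fit() then needs no groups argument.
splits = list(gkf.split(X, y, groups=groups))
rs_list = RandomizedSearchCV(forest, parameters, scoring="roc_auc",
                             cv=splits, n_iter=4, random_state=0)
rs_list.fit(X, y)

# Sanity check: no group appears on both sides of any split.
no_leak = all(set(groups[train]).isdisjoint(groups[test])
              for train, test in splits)
print(rs.best_params_, rs.best_score_, no_leak)
```

Option 1 keeps the splitter reusable with other data, while Option 2 freezes the folds, which can be handy if the exact same splits need to be shared across several searches.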