Put customized functions in an Sklearn pipeline

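My classification scheme has several steps:

1. SMOTE (Synthetic Minority Over-sampling Technique)
2. Fisher criterion for feature selection
3. Standardization (Z-score normalization)
4. SVC (Support Vector Classifier)

The main parameters to tune in this scheme are the percentile of features to keep (2.) and the hyperparameters of the SVC (4.), and I want to tune them with a grid search.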

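The current solution builds a "partial" pipeline covering only steps 3 and 4 of the scheme, clf = Pipeline([('normal',preprocessing.StandardScaler()),('svc',svm.SVC(class_weight='auto'))]), and breaks the scheme into two parts:

Tune the percentile of features to keep through the first grid search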

skf = StratifiedKFold(y)
for train_ind, test_ind in skf:
    X_train, X_test, y_train, y_test = X[train_ind], X[test_ind], y[train_ind], y[test_ind]
    # SMOTE synthesizes the training data (we want to keep test data intact)
    X_train, y_train = SMOTE(X_train, y_train)
    for percentile in percentiles:
        # Fisher returns the indices of the selected features specified by the parameter 'percentile'
        selected_ind = Fisher(X_train, …
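
For comparison, steps 2 to 4 can in principle live in one pipeline so that a single grid search tunes both the percentile and the SVC hyperparameters. The following is only a minimal sketch under stated assumptions, not the poster's actual code: fisher_score is a hypothetical stand-in for the poster's Fisher function (whose real signature is cut off above), the data is synthetic, and the grid values are arbitrary. It relies on scikit-learn's SelectPercentile accepting a custom score_func that returns one score per feature. SMOTE is left out on purpose, because a plain scikit-learn Pipeline step cannot resample y; the imbalanced-learn package offers its own pipeline class for samplers. class_weight="balanced" is used because recent scikit-learn releases no longer accept 'auto'.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectPercentile
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def fisher_score(X, y):
    """Hypothetical Fisher criterion: per-feature ratio of between-class
    scatter to within-class scatter (an assumption, not the poster's code)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        X_c = X[y == label]
        n_c = X_c.shape[0]
        between += n_c * (X_c.mean(axis=0) - overall_mean) ** 2
        within += n_c * X_c.var(axis=0)
    # Small constant avoids division by zero for constant features.
    return between / (within + 1e-12)


# Steps 2, 3 and 4 of the scheme as one estimator.
pipe = Pipeline([
    ("fisher", SelectPercentile(score_func=fisher_score)),
    ("normal", StandardScaler()),
    ("svc", SVC(class_weight="balanced")),
])

# Parameters are addressed as <step name>__<parameter name>.
param_grid = {
    "fisher__percentile": [10, 25, 50],
    "svc__C": [0.1, 1, 10],
}

# Synthetic imbalanced data, purely for illustration.
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5,
    weights=[0.8, 0.2], random_state=0,
)

search = GridSearchCV(pipe, param_grid, cv=StratifiedKFold(n_splits=3), scoring="f1")
search.fit(X, y)
print(search.best_params_)
```

Because the selector sits inside the pipeline, it is refit on the training portion of every fold, so the held-out fold never influences which features are kept.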

pipeline machine-learning feature-selection scikit-learn cross-validation

Fra*_*cis

2021 06-18

10 votes
2 answers
20k views