Posted by nav*_*hai

Help with a Scikit-learn model when using GridSearch

As part of the Enron project, I built the model below; here is a summary of the steps.

The following model gives near-perfect scores:

from sklearn.model_selection import StratifiedShuffleSplit, GridSearchCV

# `features` and `labels` are assumed to be numpy arrays; `pipe` and
# `clf_params` are the pipeline and parameter grid described in the steps below.
cv = StratifiedShuffleSplit(n_splits=100, test_size=0.2, random_state=42)
gcv = GridSearchCV(pipe, clf_params, cv=cv)

gcv.fit(features, labels)  # fit on the full dataset

for train_ind, test_ind in cv.split(features, labels):
    x_train, x_test = features[train_ind], features[test_ind]
    y_train, y_test = labels[train_ind], labels[test_ind]

    # predict with the best estimator fitted on the full dataset (no refit)
    gcv.best_estimator_.predict(x_test)

The model below gives more reasonable but lower scores:

cv = StratifiedShuffleSplit(n_splits=100, test_size=0.2, random_state=42)
gcv = GridSearchCV(pipe, clf_params, cv=cv)

gcv.fit(features, labels)  # fit on the full dataset

for train_ind, test_ind in cv.split(features, labels):
    x_train, x_test = features[train_ind], features[test_ind]
    y_train, y_test = labels[train_ind], labels[test_ind]

    # refit the best estimator on the training fold before predicting
    gcv.best_estimator_.fit(x_train, y_train)
    gcv.best_estimator_.predict(x_test)
  1. Used SelectKBest to find the feature scores, ranked the features, and tried combinations of higher- and lower-scoring features (a sketch of such a pipeline follows this list).

  2. Used an SVM with GridSearch, using StratifiedShuffleSplit for cross-validation.

  3. Used best_estimator_ to predict and to compute precision and recall.
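For context, `pipe` and `clf_params` are not shown in the question. Below is a minimal sketch of what they might look like, assuming a SelectKBest + SVC pipeline as described in the steps above, together with an illustrative precision/recall computation for step 3. The parameter grid, placeholder data, and metric loop are assumptions for illustration, not the original project code.

# Hypothetical reconstruction -- `pipe`, `clf_params`, and the data here are
# assumptions, not the code from the original project.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedShuffleSplit, GridSearchCV
from sklearn.metrics import precision_score, recall_score

# Placeholder data standing in for the Enron features/labels.
rng = np.random.RandomState(42)
features = rng.randn(200, 20)
labels = rng.randint(0, 2, size=200)

# Steps 1-2: SelectKBest feature selection feeding an SVM, tuned with GridSearch.
pipe = Pipeline([
    ("select", SelectKBest(score_func=f_classif)),
    ("clf", SVC()),
])
clf_params = {
    "select__k": [5, 10],        # illustrative values only
    "clf__C": [1, 10],
    "clf__gamma": ["scale", 0.01],
}

cv = StratifiedShuffleSplit(n_splits=100, test_size=0.2, random_state=42)
gcv = GridSearchCV(pipe, clf_params, cv=cv)
gcv.fit(features, labels)

# Step 3: precision/recall over the CV folds, refitting the best estimator on
# each training fold (the second variant from the question).
precisions, recalls = [], []
for train_ind, test_ind in cv.split(features, labels):
    x_train, x_test = features[train_ind], features[test_ind]
    y_train, y_test = labels[train_ind], labels[test_ind]

    est = gcv.best_estimator_
    est.fit(x_train, y_train)
    pred = est.predict(x_test)
    precisions.append(precision_score(y_test, pred, zero_division=0))
    recalls.append(recall_score(y_test, pred, zero_division=0))

print("precision:", np.mean(precisions), "recall:", np.mean(recalls))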

The problem is that the estimator is spitting out perfect scores, in some cases 1 …

Tags: python, machine-learning, scikit-learn, cross-validation, grid-search
