Pipeline: Multiple classifiers?

I am reading the following example on Pipelines and GridSearchCV in Python: http://www.davidsbatista.net/blog/2017/04/01/document_classification/

Logistic Regression:

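from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline

# stop_words is assumed to be defined beforehand (e.g. a list of English stop words)
pipeline = Pipeline([
    ('tfidf', TfidfVectorizer(stop_words=stop_words)),
    ('clf', OneVsRestClassifier(LogisticRegression(solver='sag'))),
])
# Keys use <step>__<param>; clf__estimator__* reaches the estimator wrapped by OneVsRestClassifier
parameters = {
    'tfidf__max_df': (0.25, 0.5, 0.75),
    'tfidf__ngram_range': [(1, 1), (1, 2), (1, 3)],
    "clf__estimator__C": [0.01, 0.1, 1],
    "clf__estimator__class_weight": ['balanced', None],
}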

SVM:

from sklearn.svm import LinearSVC

# Same structure as the Logistic Regression pipeline; only the wrapped estimator changes
pipeline = Pipeline([
    ('tfidf', TfidfVectorizer(stop_words=stop_words)),
    ('clf', OneVsRestClassifier(LinearSVC())),
])
parameters = {
    'tfidf__max_df': (0.25, 0.5, 0.75),
    'tfidf__ngram_range': [(1, 1), (1, 2), (1, 3)],
    "clf__estimator__C": [0.01, 0.1, 1],
    "clf__estimator__class_weight": ['balanced', None],
}

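Is there a way to combine Logistic Regression and SVM into a single Pipeline? Say I have one TfidfVectorizer and would like to test it with multiple classifiers, with each classifier then reporting its best model/parameters.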
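
One possible approach, sketched here rather than taken from the linked post: GridSearchCV accepts a list of parameter grids, and a Pipeline step can itself be listed as a parameter, so each grid can swap in a different classifier for the 'clf' step. The toy corpus, the reduced grid values, and the name best_per_classifier are assumptions made only so the sketch runs end to end.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy corpus with three classes (pets, finance, weather).
docs = [
    "the cat sat on the mat", "my dog barks at the cat",
    "puppies and kittens play together", "the dog chased the kitten",
    "stocks fell sharply today", "investors bought more bonds",
    "the bank raised interest rates", "traders sold stocks and bonds",
    "heavy rain is expected tomorrow", "the storm brought strong wind",
    "sunny skies and warm weather", "snow and rain fell overnight",
]
labels = [0] * 4 + [1] * 4 + [2] * 4

# The 'clf' step here is only a placeholder; each grid below replaces it.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", OneVsRestClassifier(LogisticRegression())),
])

# Parameters shared by both classifiers (values are illustrative).
shared = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "clf__estimator__C": [0.1, 1],
    "clf__estimator__class_weight": ["balanced", None],
}
# One grid per classifier: the 'clf' key swaps the whole pipeline step.
param_grid = [
    {"clf": [OneVsRestClassifier(LogisticRegression(solver="sag", max_iter=1000))], **shared},
    {"clf": [OneVsRestClassifier(LinearSVC())], **shared},
]

search = GridSearchCV(pipeline, param_grid, cv=2, scoring="accuracy")
search.fit(docs, labels)

# Collect the best-scoring candidate for each classifier type.
results = search.cv_results_
best_per_classifier = {}
for params, score in zip(results["params"], results["mean_test_score"]):
    name = type(params["clf"].estimator).__name__
    best = best_per_classifier.get(name)
    if best is None or score > best["score"]:
        best_per_classifier[name] = {"score": score, "params": params}

for name, best in best_per_classifier.items():
    print(name, best["score"], best["params"])
```

search.best_estimator_ holds the overall winner across both classifiers, while best_per_classifier keeps the top candidate for each one. On a real dataset the original grids (including tfidf__max_df) could be dropped into shared unchanged.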

python pipeline scikit-learn grid-search

Chr*_*her

2018-12-26

Score: 5

Answers: 3

Views: 4223

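"Parallel" pipeline to get the best model using gridsearch

In sklearn, a serial pipeline can be defined so that the best combination of hyperparameters is found for all consecutive parts of the pipeline. A serial pipeline can be implemented as follows: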

from sklearn.svm import SVC
from sklearn import decomposition, datasets
from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV

digits = datasets.load_digits()
X_train = digits.data
y_train = digits.target

# Use Principal Component Analysis to reduce dimensionality
# and improve generalization
pca = decomposition.PCA()
# Use an SVC (the default kernel is RBF; pass kernel='linear' for a linear SVC)
svm = SVC()
# Combine PCA and SVC into a pipeline
pipe = Pipeline(steps=[('pca', pca), ('svm', svm)])
# Candidate numbers of PCA components to search over
n_components = [20, 40, 64]
params_grid = {
    'svm__C': …
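
The listing above is cut off at the parameter grid. As a rough, self-contained sketch of how a serial PCA then SVC pipeline like this one might be searched, the following reuses the n_components candidates from the post; the svm__C and svm__kernel values, and cv=3, are placeholders assumed for illustration and are not the asker's settings.

```python
from sklearn import datasets, decomposition
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

digits = datasets.load_digits()
X_train, y_train = digits.data, digits.target

# Serial pipeline: PCA output feeds the SVC.
pipe = Pipeline(steps=[("pca", decomposition.PCA()), ("svm", SVC())])

# Keys follow <step name>__<parameter name>.
# The C and kernel values are hypothetical, not from the original post.
params_grid = {
    "pca__n_components": [20, 40, 64],
    "svm__C": [1, 10],
    "svm__kernel": ["linear", "rbf"],
}

# Every combination of the three lists is cross-validated.
search = GridSearchCV(pipe, params_grid, cv=3)
search.fit(X_train, y_train)

print(search.best_params_)
print(search.best_score_)
```

Because the grid is a single dict, every candidate uses the same two steps; only their hyperparameters vary.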

python machine-learning scikit-learn grid-search

use*_*212

2018-04-16

Score: 0

Answers: 1

Views: 1885