I created a pipeline in scikit-learn as follows:
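import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline

# Pipeline steps: whitespace-tokenized TF-IDF features, then a linear SGD classifier.
estimators2 = [
    ('tfidf', TfidfVectorizer(tokenizer=lambda string: string.split())),
    ('clf', SGDClassifier(n_jobs=13, early_stopping=True, class_weight='balanced'))  # first n_jobs
]
# Hyperparameter candidates, keyed as <step name>__<parameter name>.
parameters2 = {
    'tfidf__min_df': np.arange(10, 30, 10),
    'tfidf__max_df': np.arange(0.75, 0.9, 0.05),
    'tfidf__ngram_range': [(1, 1), (2, 2), (3, 3)],
    'clf__alpha': (1e-2, 1e-3)
}
p2 = Pipeline(estimators2)
grid2 = RandomizedSearchCV(p2, param_distributions=parameters2,
                           scoring='balanced_accuracy', n_iter=20, cv=3,
                           n_jobs=13, pre_dispatch='n_jobs')  # second n_jobs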
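In this pipeline, the n_jobs parameter appears twice: once on SGDClassifier and once on RandomizedSearchCV. How does scikit-learn handle the two settings?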