Speeding up grid search in sklearn

Chr*_*rry 1 machine-learning svm scikit-learn grid-search

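I am running a grid search to find the best SVM parameters, using ipython and sklearn. The code is slow and runs on only one core. How can I speed it up and make use of multiple cores? Thanks.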

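# Imports were not shown in the question; these are the current scikit-learn locations.
import time

import numpy as np
from sklearn import svm
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.multiclass import OneVsRestClassifier

t = time.time()  # start the timer used in the final print
random_state = np.random.RandomState(10)
# X and Y (features and labels) are defined elsewhere and not shown in the question.
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=.2, random_state=random_state)

model_to_set = OneVsRestClassifier(svm.SVC(kernel="linear"))

# The "estimator__" prefix routes each parameter to the SVC wrapped by OneVsRestClassifier.
parameters = {
    "estimator__C": [1, 2, 4, 8, 16, 32],
    "estimator__kernel": ["linear", "rbf"],
    "estimator__gamma": [1, 0.1, 1e-2, 1e-3, 1e-4],
}

model_tuning = GridSearchCV(model_to_set, param_grid=parameters)
model_tuning.fit(X_train, y_train)

print(model_tuning.best_score_)
print(model_tuning.best_params_)
print("Time passed: {0:.1f} sec".format(time.time() - t))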

lej*_*lot 5

GridSearchCV has an n_jobs parameter:

n_jobs : int, default=1

Number of jobs to run in parallel. Changed in version 0.17: Upgraded to joblib 0.9.3.
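
As a minimal sketch of how that could look with the question's setup: passing n_jobs=-1 asks joblib to use all available cores. The synthetic data from make_classification and the reduced parameter grid are assumptions made here so the example is self-contained and finishes quickly; substitute your own X, Y and the full grid.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# Synthetic stand-in data (an assumption; replace with your own X and Y).
X, Y = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=10
)
X_train, X_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.2, random_state=10
)

# A reduced grid (an assumption) so the sketch runs quickly.
parameters = {
    "estimator__C": [1, 4, 16],
    "estimator__kernel": ["linear", "rbf"],
    "estimator__gamma": [0.1, 1e-2],
}

# n_jobs=-1 runs the cross-validation fits in parallel on all available cores.
model_tuning = GridSearchCV(
    OneVsRestClassifier(SVC()), param_grid=parameters, n_jobs=-1
)
model_tuning.fit(X_train, y_train)

print(model_tuning.best_score_)
print(model_tuning.best_params_)
```

Each (parameter combination, CV fold) pair is an independent fit, so the search parallelizes well. A positive integer such as n_jobs=4 caps the number of worker processes instead of using every core.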