I am using the following code for an SVM in Python:
from sklearn import datasets
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC
iris = datasets.load_iris()
X, y = iris.data, iris.target
# class_weight='auto' was removed from scikit-learn; 'balanced' is its replacement
clf = OneVsRestClassifier(SVC(kernel='linear', probability=True, class_weight='balanced'))
clf.fit(X, y)
proba = clf.predict_proba(X)
But this takes a huge amount of time.
Actual data dimensions:
train-set (1422392,29)
test-set (233081,29)
How can I speed this up (in parallel or some other way)? Please help. I have already tried PCA and downsampling.
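For illustration, one direction that might help is to train several SVMs on small subsets of the data in parallel and average their probabilities. This is only a sketch under assumptions: `n_estimators = 4` is a hypothetical value, iris stands in for the real data, and whether the accuracy trade-off is acceptable on the actual train-set is untested. Kernel SVC training time grows faster than linearly with the number of samples, which is why fitting on subsets can be much cheaper than one fit on all rows.

```python
from sklearn import datasets
from sklearn.ensemble import BaggingClassifier
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

iris = datasets.load_iris()
X, y = iris.data, iris.target

# Hypothetical setting: each SVC sees roughly 1/n_estimators of the rows.
n_estimators = 4

base = SVC(kernel='linear', probability=True, class_weight='balanced')

# Each bagged SVC is fitted on a random subset; n_jobs=-1 spreads the
# fits across all available cores.
bagged = BaggingClassifier(
    base,
    n_estimators=n_estimators,
    max_samples=1.0 / n_estimators,
    n_jobs=-1,
    random_state=0,
)

clf = OneVsRestClassifier(bagged)
clf.fit(X, y)
proba = clf.predict_proba(X)  # one column of probabilities per class
```

`OneVsRestClassifier` also accepts its own `n_jobs` argument, so the per-class fits could be parallelized instead of (or as well as) the bagged ones.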
I have 6 classes. Edit: I found http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDClassifier.html, but I want probability estimates, and it does not seem to provide them for SVM.
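On the probability point, a possible workaround (a sketch, not something I have verified on the full data) is to wrap a fast linear SVM in `CalibratedClassifierCV`, which adds `predict_proba` to classifiers that only expose a decision function. Note the assumptions: `LinearSVC` is not identical to `SVC(kernel='linear')` (it uses the squared hinge loss by default), and `cv=3` and `max_iter=10000` are illustrative values; iris again stands in for the real data.

```python
from sklearn import datasets
from sklearn.calibration import CalibratedClassifierCV
from sklearn.svm import LinearSVC

iris = datasets.load_iris()
X, y = iris.data, iris.target

# LinearSVC handles multiclass one-vs-rest by default and scales to
# large sample counts far better than SVC, but has no predict_proba.
svm = LinearSVC(class_weight='balanced', max_iter=10000)

# Sigmoid (Platt) calibration fitted with 3-fold cross-validation
# maps the decision values to probability estimates.
clf = CalibratedClassifierCV(svm, method='sigmoid', cv=3)
clf.fit(X, y)
proba = clf.predict_proba(X)  # shape (n_samples, n_classes)
```

The same wrapper should work around `SGDClassifier` with the hinge loss, since it also exposes `decision_function`.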
Edit:
from sklearn import datasets
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC,LinearSVC
from sklearn.linear_model import SGDClassifier
import joblib
import numpy as np
from sklearn import grid_search
import multiprocessing
import math
def …