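When I train a scikit-learn v0.15 SGDClassifier with the following options: SGDClassifier(loss='log', class_weight=None, penalty='l2'), training completes without errors. However, when I train the same classifier with class_weight='auto' on scikit-learn v0.15, I get this error: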
return self.model.fit(X, y)
File "/home/rose/.local/lib/python2.7/site-packages/scikit_learn-0.15.0b1-py2.7-linux-x86_64.egg/sklearn/linear_model/stochastic_gradient.py", line 485, in fit
sample_weight=sample_weight)
File "/home/rose/.local/lib/python2.7/site-packages/scikit_learn-0.15.0b1-py2.7-linux-x86_64.egg/sklearn/linear_model/stochastic_gradient.py", line 389, in _fit
classes, sample_weight, coef_init, intercept_init)
File "/home/rose/.local/lib/python2.7/site-packages/scikit_learn-0.15.0b1-py2.7-linux-x86_64.egg/sklearn/linear_model/stochastic_gradient.py", line 336, in _partial_fit
y_ind)
File "/home/rose/.local/lib/python2.7/site-packages/scikit_learn-0.15.0b1-py2.7-linux-x86_64.egg/sklearn/utils/class_weight.py", line 43, in compute_class_weight
raise ValueError("classes should have valid labels that are in y")
ValueError: classes should have valid labels that are in y
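What could be causing it?
For reference, here is the documentation for class_weight:
Preset for the class_weight fit parameter. Weights associated with classes. If not given, all classes are supposed to have weight one. The "auto" mode uses the values of y to automatically adjust weights inversely proportional to class frequencies.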
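To make "inversely proportional to class frequencies" concrete, here is a minimal sketch in plain Python. The helper name inverse_frequency_weights and the rescaling to a mean weight of 1 are assumptions made for illustration; the exact normalization scikit-learn applies is not stated in the quoted documentation.

```python
from collections import Counter


def inverse_frequency_weights(y):
    """Hypothetical helper: weight each label by 1 / (its count in y)."""
    counts = Counter(y)
    raw = {label: 1.0 / count for label, count in counts.items()}
    # Assumption: rescale so the weights average to 1. The quoted docs
    # only say "inversely proportional", so this step is illustrative.
    mean_raw = sum(raw.values()) / len(raw)
    return {label: weight / mean_raw for label, weight in raw.items()}


# Label "a" appears three times and "b" once, so the rarer label "b"
# receives the larger weight.
weights = inverse_frequency_weights(["a", "a", "a", "b"])
```

Whatever the normalization, a label that occurs less often in y ends up with a larger weight than a label that occurs more often.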
I think this may be a bug in scikit-learn. As a workaround, try the following:
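from sklearn.preprocessing import LabelEncoder
# Map the original labels to consecutive integers 0..n_classes-1.
le = LabelEncoder()
y_encoded = le.fit_transform(y)
# Train on the encoded labels instead of the raw ones.
self.model.fit(X, y_encoded)
# Convert the integer predictions back to the original labels.
pred = le.inverse_transform(self.model.predict(X))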
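For anyone who wants to try the encode/decode round trip outside the asker's class, here is a self-contained sketch. The toy X and y are made up for illustration, and SGDClassifier is left at its default settings because the accepted values for loss and class_weight differ between scikit-learn releases; substitute the options from the question for the version you have installed.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.preprocessing import LabelEncoder

# Toy data (hypothetical): two features, string class labels.
X = np.array([[0.0, 0.1], [0.2, 0.0], [1.0, 0.9],
              [0.9, 1.1], [0.1, 0.2], [1.1, 1.0]])
y = np.array(["ham", "ham", "spam", "spam", "ham", "spam"])

# Encode labels as integers; classes are sorted, so "ham" -> 0, "spam" -> 1.
le = LabelEncoder()
y_encoded = le.fit_transform(y)

# Defaults only; add loss / penalty / class_weight as in the question.
model = SGDClassifier(random_state=0)
model.fit(X, y_encoded)

# Map integer predictions back to the original string labels.
pred = le.inverse_transform(model.predict(X))
```

The classifier only ever sees the integer codes, while the caller still gets predictions in the original label space.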