Posts by Ame*_*dav

ValueError: This solver needs samples of at least 2 classes in the data, but the data contains only one class: 0.0

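After splitting the dataset into test and training sets, I applied logistic regression to the training set, but I got the error above. While trying to debug it, I printed my response vector y_train in the console, and it showed integer values such as 0 or 1. But when I wrote it to a file, I found that the values were floats such as 0.0 and 1.0. If that is the problem, how can I get around it?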

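from sklearn import metrics
from sklearn.linear_model import LogisticRegression

lenreg = LogisticRegression()

# Inspect the first labels, then dump the whole response vector to a file
print(y_train[0:10])
y_train.to_csv(path='ytard.csv')

# Fit on the training split and score on the test split
lenreg.fit(X_train, y_train)
y_pred = lenreg.predict(X_test)
print(metrics.accuracy_score(y_test, y_pred))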

The stack trace is as follows:

Traceback (most recent call last):
  File "/home/amey/prog/pd.py", line 82, in <module>
    lenreg.fit(X_train, y_train)
  File "/usr/lib/python2.7/dist-packages/sklearn/linear_model/logistic.py", line 1154, in fit
    self.max_iter, self.tol, self.random_state)
  File "/usr/lib/python2.7/dist-packages/sklearn/svm/base.py", line 885, in _fit_liblinear
    " class: %r" % classes_[0])
ValueError: This solver needs samples of at least 2 classes in the data, but the data contains only one class: 0.0

In the meantime, I came across this link, which went unanswered. Is there a solution?
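
One way to investigate the error above is to count the distinct labels in the training vector. The message is about the number of classes, not about whether the labels are stored as 0/1 or 0.0/1.0. The sketch below is only illustrative: the toy arrays `X` and `y` are hypothetical stand-ins for the asker's data, and it assumes a scikit-learn version that provides `sklearn.model_selection.train_test_split` with the `stratify` parameter (newer than the one in the traceback).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for the asker's X and y (not from the post).
rng = np.random.RandomState(0)
X = rng.normal(size=(40, 3))
y = np.array([0.0] * 30 + [1.0] * 10)  # float labels, as seen in the CSV dump

# A plain head/tail slice of ordered data can leave a single class in one part.
y_head = y[:20]
head_classes = np.unique(y_head)
print(head_classes)

# A stratified split keeps the class proportions in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
train_classes = np.unique(y_train)
print(train_classes)

# Casting the labels to int is optional and only tidies the reported classes.
clf = LogisticRegression()
clf.fit(X_train, y_train.astype(int))
```

If `np.unique(y_train)` reports a single value on the real data, the split itself (or an earlier filtering step) is the likely place to look, rather than the float formatting of the labels.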

python-2.7 scikit-learn logistic-regression sklearn-pandas

Score: 6, Answers: 2, Views: 9334

Random sampling with disjoint groups in a Pandas data frame

I need to randomly split a data frame into two disjoint sets by the attribute 'ids'. For example, consider the following data frame:

df=
Out[470]: 
          0     1     2     3       ids
0      17.0  18.0  16.0  15.0      13.0
1      18.0  16.0  15.0  15.0      13.0
2      16.0  15.0  15.0  16.0      13.0
131    12.0   8.0  21.0  19.0      14.0
132     8.0  21.0  19.0  20.0      14.0
133    21.0  19.0  20.0   9.0      14.0
248     NaN   NaN  12.0  11.0      17.0
249     NaN  12.0  11.0  10.0      17.0
250    12.0  11.0  10.0   NaN      17.0
287     3.0   3.0   1.0   8.0      20.0
288     3.0   1.0   8.0   3.0      20.0
289     1.0   8.0   3.0 …
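
A minimal sketch of one possible approach to the split described above: shuffle the unique 'ids' values, assign part of them to the first set, and select rows by membership so that no id appears in both sets. The small frame, the column name "0", the seed, and the half-and-half ratio are all assumptions made for illustration; the question as shown is truncated and does not specify them.

```python
import numpy as np
import pandas as pd

# Hypothetical small frame modeled on the one in the question.
df = pd.DataFrame({
    "0": [17.0, 18.0, 16.0, 12.0, 8.0, 21.0, 3.0, 3.0],
    "ids": [13.0, 13.0, 13.0, 14.0, 14.0, 14.0, 20.0, 20.0],
})

# Shuffle the distinct ids (fixed seed only to make the sketch repeatable).
rng = np.random.RandomState(42)
shuffled_ids = rng.permutation(df["ids"].unique())

# Assumed ratio: roughly half of the ids go to the first set.
n_first = len(shuffled_ids) // 2
first_ids = shuffled_ids[:n_first]

# Rows follow their id, so the two sets never share an id.
in_first = df["ids"].isin(first_ids)
set_a = df[in_first]
set_b = df[~in_first]

print(sorted(set_a["ids"].unique()))
print(sorted(set_b["ids"].unique()))
```

Splitting on the ids rather than on the rows is what keeps the groups disjoint. scikit-learn's `GroupShuffleSplit` offers a similar group-wise split if that library is already in use.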

python disjoint-sets pandas

Score: 4, Answers: 1, Views: 628