Getting a deprecation warning about 1d arrays in Sklearn despite not having a 1D array

Question

I am trying to use SKLearn to run an SVM model. For now I am just trying it out on some sample data. Here is the data and the code:

import numpy as np
from sklearn import svm
import random

A = np.array([[random.randint(0, 20) for i in range(2)] for i in range(10)])
lab = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

clf = svm.SVC(kernel='linear', C=1.0)
clf.fit(A, lab)

For reference, when I run

import sklearn
sklearn.__version__

it outputs 0.17.

Now, when I run print(clf.predict([1, 1])), I get the following warning:

C:\Users\me\AppData\Local\Continuum\Anaconda2\lib\site-packages\sklearn\ut
ils\validation.py:386: DeprecationWarning: Passing 1d arrays as data is deprecat
ed in 0.17 and willraise ValueError in 0.19. Reshape your data either using X.re
shape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contain
s a single sample.
  DeprecationWarning)

It does give me a prediction, which is great. However, I find this strange for a few reasons.

I don't have a 1d array. If you print A, you get

array([[ 9, 12],
       [ 2, 16],
       [14, 14],
       [ 4,  2],
       [ 8,  4],
       [12,  3],
       [ 0,  0],
       [ 3, 13],
       [15, 17],
       [15, 16]])

That looks 2-dimensional to me. But fine, let's say that what I have really is a 1D array. Let's try to change it using reshape, as the warning suggests.

Same code as above, but now we have

A = np.array([[random.randint(0, 20) for i in range(2)] for i in range(10)]).reshape(-1,1)

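But this outputs an array of length 20, which makes no sense and is not what I want. I also tried reshape(1, -1), but that gives me a single observation/list with 20 items in it.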
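
To make the shapes described above concrete, here is a minimal sketch. It assumes a fixed 10x2 array built with np.arange as a stand-in for the random data (the values themselves do not matter, only the shapes):

```python
import numpy as np

# Fixed stand-in for the random 10x2 training matrix (values are illustrative).
A = np.arange(20).reshape(10, 2)

print(A.shape)                 # (10, 2): 10 samples, 2 features
print(A.reshape(-1, 1).shape)  # (20, 1): 20 samples, 1 feature each
print(A.reshape(1, -1).shape)  # (1, 20): 1 sample with 20 features
```

Either reshape changes what the rows and columns of the training data mean, which is why neither gives the layout wanted here.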

How can I reshape my data in numpy arrays so that I don't get this warning?

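I looked at two answers on SO, but neither worked for me: Question 1 and Question 2. Q1 seems to actually have 1D data and was solved using reshape, which I tried and which failed. Q2 has an answer about how to trace warnings and errors, which is not what I want. The other answer there is again a case of a genuinely 1D array.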

Answer 1

jua*_*aga 20

The warning comes from the predict call, not from fit. NumPy interprets [1,1] as a 1d array, so this should avoid the warning:

clf.predict(np.array([[1,1]]))

Notice that:

In [14]: p1 = np.array([1,1])

In [15]: p1.shape
Out[15]: (2,)

In [16]: p2 = np.array([[1,1]])

In [17]: p2.shape
Out[17]: (1, 2)
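
If the sample already exists as a flat list or 1d array, the same (1, 2) shape can be obtained with reshape(1, -1), which is what the warning text recommends for a single sample. A minimal sketch, where as_single_sample is a hypothetical helper name and not part of NumPy or sklearn:

```python
import numpy as np

def as_single_sample(x):
    """Return x as a 2D array of shape (1, n_features).

    Hypothetical helper; only meant for input that holds exactly one sample.
    """
    return np.asarray(x).reshape(1, -1)

p = as_single_sample([1, 1])
print(p.shape)  # (1, 2)
```

The result can then be passed to clf.predict in place of np.array([[1,1]]).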

Also notice that you cannot use an array of shape (2, 1):

In [21]: p3 = np.array([[1],[1]])

In [22]: p3.shape
Out[22]: (2, 1)

In [23]: clf.predict(p3)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-23-e4070c037d78> in <module>()
----> 1 clf.predict(p3)

/home/juan/anaconda3/lib/python3.5/site-packages/sklearn/svm/base.py in predict(self, X)
    566             Class labels for samples in X.
    567         """
--> 568         y = super(BaseSVC, self).predict(X)
    569         return self.classes_.take(np.asarray(y, dtype=np.intp))
    570 

/home/juan/anaconda3/lib/python3.5/site-packages/sklearn/svm/base.py in predict(self, X)
    303         y_pred : array, shape (n_samples,)
    304         """
--> 305         X = self._validate_for_predict(X)
    306         predict = self._sparse_predict if self._sparse else self._dense_predict
    307         return predict(X)

/home/juan/anaconda3/lib/python3.5/site-packages/sklearn/svm/base.py in _validate_for_predict(self, X)
    472             raise ValueError("X.shape[1] = %d should be equal to %d, "
    473                              "the number of features at training time" %
--> 474                              (n_features, self.shape_fit_[1]))
    475         return X
    476 

ValueError: X.shape[1] = 1 should be equal to 2, the number of features at training time
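
The traceback boils down to a check that the second dimension of X equals the number of features seen during training. The sketch below mimics that check in plain NumPy; check_features is a hypothetical helper written for illustration, not sklearn's actual validation code:

```python
import numpy as np

def check_features(X, n_features):
    """Return X as a 2D array, or raise if its column count is wrong.

    Hypothetical helper that imitates the shape check from the traceback.
    """
    X = np.asarray(X)
    if X.ndim != 2:
        raise ValueError("expected a 2D array, got %dD" % X.ndim)
    if X.shape[1] != n_features:
        raise ValueError("X.shape[1] = %d should be equal to %d"
                         % (X.shape[1], n_features))
    return X

p3 = np.array([[1], [1]])                # shape (2, 1): two samples, one feature
try:
    check_features(p3, 2)
    rejected = False
except ValueError:
    rejected = True

print(rejected)                          # True
print(check_features(p3.T, 2).shape)     # (1, 2): one sample, two features
```

Transposing the (2, 1) column gives the (1, 2) row layout that a model trained on two features expects.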

Answer 2

Flo*_*oor 5

Instead of running

print(clf.predict([1, 1]))

run

print(clf.predict([[1,1]]))
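
The nested list is read as one row per sample and one column per feature, so the same form extends to several samples at once. A minimal end-to-end sketch, assuming the training data is the array printed in the question (fixed here instead of random so the run is repeatable):

```python
import numpy as np
from sklearn import svm

# Training data copied from the array printed in the question.
A = np.array([[9, 12], [2, 16], [14, 14], [4, 2], [8, 4],
              [12, 3], [0, 0], [3, 13], [15, 17], [15, 16]])
lab = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

clf = svm.SVC(kernel='linear', C=1.0)
clf.fit(A, lab)

# Each inner list is one sample with two features.
single = clf.predict([[1, 1]])
batch = clf.predict([[1, 1], [15, 16], [4, 2]])

print(single.shape)  # (1,)
print(batch.shape)   # (3,)
```

predict returns one label per row, so the output length always matches the number of inner lists passed in.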
