How to fix the "Data must be 1-dimensional" exception in Python

Sac*_*mar 3 dataframe pandas

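I am trying to create a dataset to test my logistic regression algorithm, but I cannot create a pandas DataFrame from a dictionary. I get a "Data must be 1-dimensional" exception.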

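    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    # Class 0: points in [0, 2) x [0, 2)
    x1 = np.random.random(size=(10, 1)) * 2
    x2 = np.random.random(size=(10, 1)) * 2

    # Class 1: points in [2, 4) x [2, 4)
    x3 = np.random.random(size=(10, 1)) * 2 + 2
    x4 = np.random.random(size=(10, 1)) * 2 + 2

    # Labels, also created with shape (10, 1)
    y0 = np.zeros(shape=(10, 1))
    y1 = np.ones(shape=(10, 1))

    plt.scatter(x1, x2, color='g', marker='o')
    plt.scatter(x3, x4, color='r', marker='o')

    # Each concatenated value has shape (20, 1)
    dict_data = {'X1': np.concatenate((x1, x3)),
                 'X2': np.concatenate((x2, x4)),
                 'Y': np.concatenate((y0, y1))}

    # This is the line that raises the exception
    data = pd.DataFrame(dict_data, index=np.arange(20))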

This is the output I get, with the error "Data must be 1-dimensional":

    --------------------------------------------------------------------------
    Exception                                 Traceback (most recent call last)
    <ipython-input-49-fe81f079ebc6> in <module>
         13 dict_data = { 'X1':np.concatenate((x1,x3)), 'X2':np.concatenate((x2,x4)),'Y':np.concatenate((y0,y1))}
         14 #print(dict_data.shape)
    ---> 15 data = pd.DataFrame(dict_data, index=np.arange(20).reshape(20))

    ~/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
        328                                  dtype=dtype, copy=copy)
        329         elif isinstance(data, dict):
    --> 330             mgr = self._init_dict(data, index, columns, dtype=dtype)
        331         elif isinstance(data, ma.MaskedArray):
        332             import numpy.ma.mrecords as mrecords

    ~/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py in _init_dict(self, data, index, columns, dtype)
        459             arrays = [data[k] for k in keys]
        460 
    --> 461         return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
        462 
        463     def _init_ndarray(self, values, index, columns, dtype=None, copy=False):

    ~/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py in _arrays_to_mgr(arrays, arr_names, index, columns, dtype)
       6166 
       6167     # don't force copy because getting jammed in an ndarray anyway
    -> 6168     arrays = _homogenize(arrays, index, dtype)
       6169 
       6170     # from BlockManager perspective

    ~/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py in _homogenize(data, index, dtype)
       6475                 v = lib.fast_multiget(v, oindex.values, default=np.nan)
       6476             v = _sanitize_array(v, index, dtype=dtype, copy=False,
    -> 6477                                 raise_cast_failure=False)
       6478 
       6479         homogenized.append(v)

    ~/anaconda3/lib/python3.6/site-packages/pandas/core/series.py in _sanitize_array(data, index, dtype, copy, raise_cast_failure)
       3273     elif subarr.ndim > 1:
       3274         if isinstance(data, np.ndarray):
    -> 3275             raise Exception('Data must be 1-dimensional')
       3276         else:
       3277             subarr = _asarray_tuplesafe(data, dtype=dtype)

    Exception: Data must be 1-dimensional

Chr*_*ris 5

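np.random.random(size=(10,1)) produces a 2-dimensional array of shape (10, 1), but pandas builds a DataFrame from a collection of 1-dimensional arrays: each value in the dictionary becomes one column, so each value must itself be 1-dimensional.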
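A quick way to see the difference is to print the shape and number of dimensions of both kinds of array. This is only a small sketch; the names col and flat are illustrative and not part of the question.

```python
import numpy as np

# size=(10, 1) asks for 10 rows and 1 column: a 2-D "column vector"
col = np.random.random(size=(10, 1))

# size=10 asks for a flat array of 10 values: a 1-D array
flat = np.random.random(size=10)

print(col.shape, col.ndim)    # (10, 1) 2
print(flat.shape, flat.ndim)  # (10,) 1
```

np.concatenate joins along the first axis by default, so concatenating two (10, 1) arrays gives a (20, 1) array, which is still 2-dimensional.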

So use np.random.random(size=(10)) to create 1-dimensional arrays instead; those can then be used to build the DataFrame.
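Applied to the code in the question, the change might look like the sketch below. It assumes the same variable names as the question and leaves out the plotting calls. The ravel() alternative at the end is not from the answer above; it is one common way to flatten an array that already has shape (n, 1).

```python
import numpy as np
import pandas as pd

# 1-D feature arrays: size=10 instead of size=(10, 1)
x1 = np.random.random(size=10) * 2
x2 = np.random.random(size=10) * 2
x3 = np.random.random(size=10) * 2 + 2
x4 = np.random.random(size=10) * 2 + 2

# 1-D label arrays
y0 = np.zeros(shape=10)
y1 = np.ones(shape=10)

# Each concatenated value now has shape (20,), so it can serve as a column
dict_data = {
    'X1': np.concatenate((x1, x3)),
    'X2': np.concatenate((x2, x4)),
    'Y': np.concatenate((y0, y1)),
}
data = pd.DataFrame(dict_data, index=np.arange(20))
print(data.shape)  # (20, 3)

# Alternative (an assumption, not part of the answer): flatten an
# existing (n, 1) array with ravel() before handing it to pandas
col = np.random.random(size=(10, 1))
flat_df = pd.DataFrame({'X1': col.ravel()})
```

Passing index=np.arange(20) is optional here, since pandas assigns a default integer index 0..19 when none is given.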