Gov*_*rai 13 python numpy dataframe pandas
I am trying to create a very simple pandas DataFrame from a dictionary. The dictionary has 3 items, and so does the DataFrame:
>>> import pandas as pd
>>> # from a dictionary
>>> dict1 = {"x": [1, 2, 3],
...          "y": list(
...              [
...                  [2, 4, 6],
...                  [3, 6, 9],
...                  [4, 8, 12]
...              ]
...          ),
...          "z": 100}
>>> df1 = pd.DataFrame(dict1)
>>> df1
   x           y    z
0  1   [2, 4, 6]  100
1  2   [3, 6, 9]  100
2  3  [4, 8, 12]  100
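I then replaced the list under the "y" key with a NumPy array and tried to create a DataFrame from the dictionary. The line that creates the DataFrame raises an error. Below are the code I tried to run and the error I got (in separate code blocks for readability).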
>>> import numpy as np
>>> import pandas as pd
>>> dict2 = {"x": [1, 2, 3],
...          "y": np.array(
...              [
...                  [2, 4, 6],
...                  [3, 6, 9],
...                  [4, 8, 12]
...              ]
...          ),
...          "z": 100}
>>> df2 = pd.DataFrame(dict2)  # see the block below for the error
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
d:\studies\compsci\pyscripts\study\pandas-realpython\data-delightful\01.intro.ipynb Cell 10' in <module>
      1 # from a dicitionary
      2 dict1 = {"x": [1, 2, 3],
      3          "y": np.array(
      4              [
     (...)
      9              ),
     10          "z": 100}
---> 12 df1 = pd.DataFrame(dict1)

File ~\anaconda3\envs\dst\lib\site-packages\pandas\core\frame.py:636, in DataFrame.__init__(self, data, index, columns, dtype, copy)
    630     mgr = self._init_mgr(
    631         data, axes={"index": index, "columns": columns}, dtype=dtype, copy=copy
    632     )
    634 elif isinstance(data, dict):
    635     # GH#38939 de facto copy defaults to False only in non-dict cases
--> 636     mgr = dict_to_mgr(data, index, columns, dtype=dtype, copy=copy, typ=manager)
    637 elif isinstance(data, ma.MaskedArray):
    638     import numpy.ma.mrecords as mrecords

File ~\anaconda3\envs\dst\lib\site-packages\pandas\core\internals\construction.py:502, in dict_to_mgr(data, index, columns, dtype, typ, copy)
    494     arrays = [
    495         x
    496         if not hasattr(x, "dtype") or not isinstance(x.dtype, ExtensionDtype)
    497         else x.copy()
    498         for x in arrays
    499     ]
    500 # TODO: can we get rid of the dt64tz special case above?
--> 502 return arrays_to_mgr(arrays, columns, index, dtype=dtype, typ=typ, consolidate=copy)

File ~\anaconda3\envs\dst\lib\site-packages\pandas\core\internals\construction.py:120, in arrays_to_mgr(arrays, columns, index, dtype, verify_integrity, typ, consolidate)
    117 if verify_integrity:
    118     # figure out the index, if necessary
    119     if index is None:
--> 120         index = _extract_index(arrays)
    121     else:
    122         index = ensure_index(index)

File ~\anaconda3\envs\dst\lib\site-packages\pandas\core\internals\construction.py:661, in _extract_index(data)
    659         raw_lengths.append(len(val))
    660     elif isinstance(val, np.ndarray) and val.ndim > 1:
--> 661         raise ValueError("Per-column arrays must each be 1-dimensional")
    663 if not indexes and not raw_lengths:
    664     raise ValueError("If using all scalar values, you must pass an index")

ValueError: Per-column arrays must each be 1-dimensional
Even though both arrays have the same dimensions, why does the second attempt end in an error? What is a workaround for this?
Ham*_*zah 14
If you look closely at the error message and take a quick look at the pandas source code:
elif isinstance(val, np.ndarray) and val.ndim > 1:
    raise ValueError("Per-column arrays must each be 1-dimensional")
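You will see that pandas raises this error whenever a dictionary value is a NumPy array with more than one dimension, as in your example. A list works because a list never has more than one dimension, even when it is a list of lists; only its length counts.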
lst = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(len(lst))  # 3: the outer list has 3 elements, like shape (3,), not (3, 3) as with a NumPy array
You can try np.array([1,2,3]) instead; it works because the number of dimensions is 1. To check:
import numpy as np
arr = np.array([1, 2, 3])
print(arr.ndim)  # output is 1
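For contrast, here is a minimal sketch of the two-dimensional case from the question. The names arr_2d and error_message are illustrative only, and the exact wording of the error may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# The 2-D array from the question (the name `arr_2d` is illustrative).
arr_2d = np.array([[2, 4, 6], [3, 6, 9], [4, 8, 12]])
print(arr_2d.ndim)   # 2
print(arr_2d.shape)  # (3, 3)

# Passing it as a dict value trips the `val.ndim > 1` check quoted above.
error_message = None
try:
    pd.DataFrame({"x": [1, 2, 3], "y": arr_2d, "z": 100})
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```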
If you need to use a NumPy array in the dictionary, you can convert it to a list with .tolist().
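A minimal sketch of that workaround, reusing the data from the question. The names arr, fixed, and df are illustrative, not part of the original post.

```python
import numpy as np
import pandas as pd

arr = np.array([[2, 4, 6], [3, 6, 9], [4, 8, 12]])

# .tolist() turns the 2-D array into a list of lists, so each inner
# list becomes one cell of the "y" column.
fixed = {"x": [1, 2, 3], "y": arr.tolist(), "z": 100}
df = pd.DataFrame(fixed)
print(df)
```

Each cell of the "y" column then holds a plain Python list, as in the first example in the question, so NumPy operations on that column need the cells converted back to arrays first.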