Putting a 2D array into a pandas Series

zem*_*eng 9 python numpy pandas

I have a 2D NumPy array that I want to put into a pandas Series (not a DataFrame):

>>> import pandas as pd
>>> import numpy as np
>>> a = np.zeros((5, 2))
>>> a
array([[ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]])

But this raises an error:

>>> s = pd.Series(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/miniconda/envs/pyspark/lib/python3.4/site-packages/pandas/core/series.py", line 227, in __init__
    raise_cast_failure=True)
  File "/miniconda/envs/pyspark/lib/python3.4/site-packages/pandas/core/series.py", line 2920, in _sanitize_array
    raise Exception('Data must be 1-dimensional')
Exception: Data must be 1-dimensional

A hack is possible:

>>> s = pd.Series(map(lambda x:[x], a)).apply(lambda x:x[0])
>>> s
0    [0.0, 0.0]
1    [0.0, 0.0]
2    [0.0, 0.0]
3    [0.0, 0.0]
4    [0.0, 0.0]

Is there a better way?

bpa*_*hev 8

Well, you can use the numpy.ndarray.tolist method, as follows:

>>> a = np.zeros((5,2))
>>> a
array([[ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]])
>>> a.tolist()
[[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
>>> pd.Series(a.tolist())
0    [0.0, 0.0]
1    [0.0, 0.0]
2    [0.0, 0.0]
3    [0.0, 0.0]
4    [0.0, 0.0]
dtype: object

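Edit:

A faster way to get a similar result is simply pd.Series(list(a)). This produces a Series of NumPy arrays rather than Python lists, so it should be faster than a.tolist(), which has to build a list of Python lists.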
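To make the difference between the two approaches concrete, here is a minimal sketch that builds both Series and checks what each element actually is. The array `big` and the `timeit` settings are assumptions added only for illustration; the timing is printed rather than quoted, since it depends on the machine and on the NumPy and pandas versions.

```python
import timeit

import numpy as np
import pandas as pd

a = np.zeros((5, 2))

# Elements are plain Python lists.
s_lists = pd.Series(a.tolist())

# Elements are 1-D NumPy arrays, one per row of `a`.
s_arrays = pd.Series(list(a))

print(type(s_lists.iloc[0]))   # <class 'list'>
print(type(s_arrays.iloc[0]))  # <class 'numpy.ndarray'>
print(s_arrays.dtype)          # object

# Rough timing on a larger, hypothetical array (not from the question).
# No result is quoted here because it varies between environments.
big = np.zeros((10_000, 2))
t_lists = timeit.timeit(lambda: pd.Series(big.tolist()), number=20)
t_arrays = timeit.timeit(lambda: pd.Series(list(big)), number=20)
print(f"tolist(): {t_lists:.4f}s  list(): {t_arrays:.4f}s")
```

Either way the Series has dtype object, so vectorized numeric operations on it will not behave like they do on a float Series; it may be worth measuring both variants on your own data before picking one.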

  • I found another, faster way. See the edited answer. (4 upvotes)