How to convert a NumPy array to a pandas DataFrame

Question

Yan*_*ick 4 python numpy type-conversion pandas numpy-ndarray

I have a NumPy array that looks like this:

[400.31865662]
[401.18514808]
[404.84015554]
[405.14682194]
[405.67735105]
[273.90969447]
[274.0894528]

When I try to convert it to a pandas DataFrame with the following code:

y = pd.DataFrame(data)
print(y)

I get the following output when printing. Why am I getting all these zeros?

            0
0  400.318657
            0
0  401.185148
            0
0  404.840156
            0
0  405.146822
            0
0  405.677351
            0
0  273.909694
            0
0  274.089453

I want a single-column DataFrame that looks like this:

400.31865662
401.18514808
404.84015554
405.14682194
405.67735105
273.90969447
274.0894528

Answer 1

Nic*_*ais 15

Since I assume many visitors to this post are not here for the OP's specific and non-reproducible problem, here is a general answer:

df = pd.DataFrame(array)

Part of the appeal of pandas is that its tables are easy to read (like Excel), so it is worth giving the columns names.

import numpy as np
import pandas as pd

array = np.random.rand(5, 5)

array([[0.723, 0.177, 0.659, 0.573, 0.476],
       [0.77 , 0.311, 0.533, 0.415, 0.552],
       [0.349, 0.768, 0.859, 0.273, 0.425],
       [0.367, 0.601, 0.875, 0.109, 0.398],
       [0.452, 0.836, 0.31 , 0.727, 0.303]])

columns = [f'col_{num}' for num in range(5)]
index = [f'index_{num}' for num in range(5)]

This is where the magic happens:

df = pd.DataFrame(array, columns=columns, index=index)

            col_0     col_1     col_2     col_3     col_4
index_0  0.722791  0.177427  0.659204  0.572826  0.476485
index_1  0.770118  0.311444  0.532899  0.415371  0.551828
index_2  0.348923  0.768362  0.858841  0.273221  0.424684
index_3  0.366940  0.600784  0.875214  0.108818  0.397671
index_4  0.451682  0.836315  0.310480  0.727409  0.302597
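Applied to the question's data, the same idea gives a single named column. This is a minimal sketch, assuming the array has shape (7, 1) as the printed values suggest; the column name `value` is made up for illustration.

```python
import numpy as np
import pandas as pd

# Assumed shape (7, 1): one value per row, as in the question.
data = np.array([[400.31865662], [401.18514808], [404.84015554],
                 [405.14682194], [405.67735105], [273.90969447],
                 [274.0894528]])

# A 2-D array with one column maps directly to a one-column DataFrame.
# "value" is a hypothetical column name.
df = pd.DataFrame(data, columns=["value"])
print(df)
```

If the array really is two-dimensional like this, no reshaping is needed before the conversion.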

Answer 2

Dan*_*ejo 9

You can flatten the NumPy array:

import numpy as np
import pandas as pd

data = [[400.31865662],
        [401.18514808],
        [404.84015554],
        [405.14682194],
        [405.67735105],
        [273.90969447],
        [274.0894528]]

arr = np.array(data)

df = pd.DataFrame(data=arr.flatten())

print(df)

Output

            0
0  400.318657
1  401.185148
2  404.840156
3  405.146822
4  405.677351
5  273.909694
6  274.089453
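The repeated `0` header in the question's output is what you would see if a separate DataFrame were built and printed for each row, for example inside a loop. That is only a guess, since the question does not show the surrounding code. The sketch below contrasts that pattern with a single conversion; it uses `ravel()` instead of `flatten()` merely to show an alternative that avoids copying when it can.

```python
import numpy as np
import pandas as pd

arr = np.array([[400.31865662], [401.18514808], [404.84015554]])

# Hypothetical cause: one DataFrame per row, each printed with its own header.
per_row = [pd.DataFrame(row) for row in arr]
for frame in per_row:
    print(frame)

# Single conversion: one DataFrame with a running index and one column.
df = pd.DataFrame(arr.ravel())
print(df)
```

Each element of `per_row` is a 1x1 frame whose only index label is 0, which would explain the repeated zeros.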

Answer 3

aks*_*k07 5

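There is another way that the other answers do not mention. If you have a NumPy array that is essentially a row vector (or column vector), i.e. one with a shape like (n,), you can do the following: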

import numpy as np
import pandas as pd

# sample 1-D array of shape (20,)
x = np.zeros(20)
# empty DataFrame
df = pd.DataFrame()
# add the array to df as a column
df['column_name'] = x

This way you can add multiple arrays as separate columns.
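As a rough illustration of adding several arrays, the sketch below assigns two 1-D arrays of equal length to an empty DataFrame and then builds the same table in one step from a dict. The array and column names are invented for the example.

```python
import numpy as np
import pandas as pd

# Two 1-D arrays of equal length; names are illustrative only.
heights = np.array([1.0, 2.0, 3.0])
weights = np.array([10.0, 20.0, 30.0])

# Column-by-column assignment, as described above.
df = pd.DataFrame()
df["height"] = heights
df["weight"] = weights

# One-step alternative: a dict mapping column names to arrays.
df_from_dict = pd.DataFrame({"height": heights, "weight": weights})
print(df)
```

The dict form is usually more convenient when all arrays are available up front; column assignment fits better when they arrive one at a time.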
