Why is the shape of my pandas DataFrame selection wrong?

Question

Seb*_*bMa 5 python shape slice dataframe pandas

I have a pandas DataFrame named df, where df.shape is (53, 80) and both the index and the columns are int.

If I select the first row like this, I get:

df.loc[0].shape
(80,)

instead of:

(1,80)

However, df.loc[0:0].shape and df[0:1].shape both show the correct shape.

Answer 1

jpp*_*jpp 5

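df.loc[0] returns a one-dimensional pd.Series object representing a single row, extracted by indexing.

df.loc[0:0] returns a two-dimensional pd.DataFrame object representing one row of the DataFrame, extracted by slicing.

You can see this more clearly if you print the results of these operations: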

import pandas as pd, numpy as np

df = pd.DataFrame(np.arange(9).reshape(3, 3))

res1 = df.loc[0]
res2 = df.loc[0:0]

print(type(res1), res1, sep='\n')

<class 'pandas.core.series.Series'>
0    0
1    1
2    2
Name: 0, dtype: int32

print(type(res2), res2, sep='\n')

<class 'pandas.core.frame.DataFrame'>
   0  1  2
0  0  1  2
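
If you need a (1, 80) result without writing a slice, one common option is to pass a list of labels instead of a scalar. The sketch below is illustrative only: the zero-filled 53 x 80 frame is a stand-in for the asker's data, and the variable names are made up for this example.

```python
import numpy as np
import pandas as pd

# Stand-in for the asker's data: 53 rows, 80 columns, integer labels on both axes.
df = pd.DataFrame(np.zeros((53, 80), dtype=int))

row_series = df.loc[0]                    # scalar label: the row axis is dropped
row_frame = df.loc[[0]]                   # list of labels: the row axis is kept
row_frame_pos = df.iloc[[0]]              # same idea with a positional selector
row_from_series = df.loc[0].to_frame().T  # rebuild a one-row frame from the Series

print(row_series.shape)       # (80,)
print(row_frame.shape)        # (1, 80)
print(row_frame_pos.shape)    # (1, 80)
print(row_from_series.shape)  # (1, 80)
```

A list selector always returns a DataFrame, even when the list holds a single label, so it is often the least surprising way to keep the result two-dimensional.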

This convention follows NumPy indexing/slicing, which is natural since pandas is built on NumPy arrays.

arr = np.arange(9).reshape(3, 3)

print(arr[0].shape)    # (3,), i.e. 1-dimensional
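print(arr[0:0].shape)  # (0, 3), i.e. 2-dimensional (but empty: NumPy slices exclude the stop)
print(arr[0:1].shape)  # (1, 3), i.e. 2-dimensional, one row like df.loc[0:0]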
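
One caveat on the slice bounds: .loc slices by label and includes the stop label, while .iloc slices by position and excludes the stop, as NumPy does. A minimal sketch on the same 3 x 3 frame with its default integer index (the variable names are only for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(9).reshape(3, 3))

label_slice = df.loc[0:0]   # label-based: the stop label 0 is included
pos_slice = df.iloc[0:1]    # positional: the stop position 1 is excluded
empty_slice = df.iloc[0:0]  # positional 0:0 selects no rows at all

print(label_slice.shape)              # (1, 3)
print(pos_slice.shape)                # (1, 3)
print(empty_slice.shape)              # (0, 3)
print(label_slice.equals(pos_slice))  # True
```

So with a default index, df.loc[0:0] lines up with df.iloc[0:1] rather than df.iloc[0:0]. All three results are still two-dimensional, which is the point of the comparison with NumPy.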

Archived:	7 years, 8 months ago
Viewed:	2898 times
Last activity:	5 years, 11 months ago