Pandas indexing and Key error

Question


Consider the following:

import pandas as pd

d = {'a': 0.0, 'b': 1.0, 'c': 2.0}
e = pd.Series(d, index=['a', 'b', 'c'])
df = pd.DataFrame({'A': 1., 'B': e, 'C': pd.Timestamp('20130102')})


When I try to access the first row of column B with:

>>> df.B[0]
0.0


I get the correct result.

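However, after reading "KeyError: 0 when accessing value in pandas series", I assumed that, since I specified the index as 'a', 'b' and 'c', the correct way to access the first row of column B by position is df.B.iloc[0], and that df.B[0] should raise a KeyError. I don't know what I am missing. Can someone clarify in which cases I would get a KeyError?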

Answer 1

Jus*_*zas 7

The problem in the question you referenced is that the index of the given DataFrame consists of integers, but does not start at 0.

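The behavior of pandas when asked for df.B[0] is ambiguous: it depends on the data type of the index and on the data type of the value passed inside the square brackets. It can behave like df.B.loc[0] (label based), like df.B.iloc[0] (position based), or possibly like something else I am not aware of. For predictable behavior, I recommend using loc and iloc.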
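
As a minimal sketch of why the explicit accessors matter, consider a Series whose integer labels are not in positional order. The data here is made up for illustration and is not taken from the question; with it, loc and iloc return different rows for the same key 0:

```python
import pandas as pd

# Illustrative data (an assumption, not from the question):
# integer labels stored in a shuffled order.
s = pd.Series([10.0, 20.0, 30.0], index=[2, 0, 1])

by_label = s.loc[0]      # the row whose label is 0  -> 20.0
by_position = s.iloc[0]  # the first row, whatever its label -> 10.0

print(by_label, by_position)
```

Spelling out loc or iloc states the intent in the code itself, so the result does not depend on how plain square brackets happen to resolve the key.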

To illustrate this with your example:

import pandas as pd

d = [0.0, 1.0, 2.0]
e = pd.Series(d, index=['a', 'b', 'c'])
df = pd.DataFrame({'A': 1., 'B': e, 'C': pd.Timestamp('20130102')})

# Each line is meant to be run on its own; the failing ones stop execution.
df.B[0] # 0.0 - falls back to position based (deprecated in newer pandas versions)
df.B['0'] # KeyError - no label '0' in index
df.B['a'] # 0.0 - found label 'a' in index
df.B.loc[0] # TypeError - string index queried by integer value (KeyError in newer pandas versions)
df.B.loc['0'] # KeyError - no label '0' in index
df.B.loc['a'] # 0.0 - found label 'a' in index
df.B.iloc[0] # 0.0 - position based query for row 0
df.B.iloc['0'] # TypeError - string can't be used for position
df.B.iloc['a'] # TypeError - string can't be used for position


Using the example from the referenced post:

d = [0.0, 1.0, 2.0]
e = pd.Series(d, index = [4, 5, 6])
df = pd.DataFrame({'A': 1., 'B': e, 'C': pd.Timestamp('20130102')})

df.B[0] # KeyError - label 0 not in index
df.B['0'] # KeyError - label '0' not in index
df.B.loc[0] # KeyError - label 0 not in index
df.B.loc['0'] # KeyError - label '0' not in index
df.B.iloc[0] # 0.0 - position based query for row 0
df.B.iloc['0'] # TypeError - string can't be used for position
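
Because each failing lookup above stops execution, it can help to wrap the lookups so that all outcomes are collected in one run. The following is only a sketch: try_lookup is a hypothetical helper name introduced here, not part of pandas, and the exact exception type for some lookups may differ between pandas versions, which is why several types are caught.

```python
import pandas as pd


def try_lookup(indexer, key):
    """Hypothetical helper: return the value, or the name of the exception raised."""
    try:
        return indexer[key]
    except (KeyError, TypeError, IndexError) as exc:
        return type(exc).__name__


# Same integer index as in the referenced post: labels 4, 5, 6.
s = pd.Series([0.0, 1.0, 2.0], index=[4, 5, 6])

results = {
    "s.loc[4]": try_lookup(s.loc, 4),    # label 4 exists
    "s.loc[0]": try_lookup(s.loc, 0),    # label 0 does not exist
    "s.iloc[0]": try_lookup(s.iloc, 0),  # position 0 exists
    "s.iloc[3]": try_lookup(s.iloc, 3),  # position 3 is out of bounds
}

for expr, outcome in results.items():
    print(f"{expr:10} -> {outcome}")
```

With loc a missing label surfaces as a KeyError, while with iloc an out-of-range position surfaces as an IndexError, so the accessor also tells you which kind of failure to expect.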

