How to correctly get a single cell in Pandas: loc[index, column] vs. get_value(index, column)

int*_*ael 2 python dataframe pandas

Which method is better (in terms of performance and reliability) for getting a single cell from a pandas DataFrame: get_value() or loc[]?

jez*_*ael 5

The answer can be found in the documentation:

For getting a value explicitly (equivalent to deprecated df.get_value('a','A'))

# this is also equivalent to ``df1.at['a','A']``
In [55]: df1.loc['a', 'A']
Out[55]: 0.13200317033032932

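However, calling get_value does not raise any warning.

And if you check the documentation for Index.get_value:

Fast lookup of value from 1-dimensional ndarray. Only use this if you know what you're doing

So I think it is better to use iat, at, loc, or ix.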
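As a minimal sketch of the recommended accessors, the snippet below reads the same cell through loc, at, and iat. The small frame and its 'a', 'b', 'c' index labels are made up for illustration and are not the df1 from the documentation excerpt.

```python
import pandas as pd

# Hypothetical example frame; the labels 'a', 'b', 'c' are assumptions.
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=['a', 'b', 'c'])

by_loc = df.loc['a', 'A']  # label-based, general-purpose indexer
by_at = df.at['a', 'A']    # label-based, scalar access only
by_iat = df.iat[0, 0]      # position-based, scalar access only

print(by_loc, by_at, by_iat)
```

All three return the same scalar here. at and iat accept only a single row/column pair, whereas loc also handles slices, lists, and boolean masks.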

Timings

import pandas as pd

df = pd.DataFrame({'A':[1,2,3],
                   'B':[4,5,6],
                   'C':[7,8,9],
                   'D':[1,3,5],
                   'E':[5,3,6],
                   'F':[7,4,3]})

print(df)

In [93]: %timeit (df.loc[0, 'A'])
The slowest run took 6.40 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 177 µs per loop

In [96]: %timeit (df.at[0, 'A'])
The slowest run took 17.01 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 7.61 µs per loop

In [94]: %timeit (df.get_value(0, 'A'))
The slowest run took 23.49 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 3.36 µs per loop
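To repeat this kind of comparison outside IPython, a rough sketch using the standard-library timeit module might look like the following. get_value is left out because it was removed in pandas 1.0; the accessors and timings names and the loop count are arbitrary choices for illustration, and absolute numbers will vary by machine and pandas version.

```python
import timeit

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Each callable reads the same cell through a different accessor.
accessors = {
    'loc': lambda: df.loc[0, 'A'],
    'at': lambda: df.at[0, 'A'],
    'iat': lambda: df.iat[0, 0],
}

n = 10_000  # number of calls per accessor (arbitrary)
timings = {}
for name, fn in accessors.items():
    total = timeit.timeit(fn, number=n)  # total seconds for n calls
    timings[name] = total / n * 1e6      # average microseconds per call
    print(f"{name}: {timings[name]:.2f} µs per call")
```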