Find the first value greater than a threshold - Python Pandas

Question

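In a time series (ordered tuples), what is the most efficient way to find the first time a criterion is met?

In particular, for the values in a column of a pandas DataFrame, what is the most efficient way to determine when the value first exceeds 100?

I was hoping for a clever vectorized solution that avoids df.iterrows().

For example, with price or count data, find when the value exceeds 100, i.e. df['col'] > 100.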

              price
date 
2005-01-01     98
2005-01-02     99
2005-01-03     100
2005-01-04     99
2005-01-05     98
2005-01-06     100
2005-01-07     100
2005-01-08     98

But this is for a potentially very large series. Is it better to iterate (slow), or is there a vectorized solution?

A df.iterrows() solution would be:

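def first_breakpoint(df, value_to_check):
    # iterrows() yields (index, row) pairs, in that order
    for ind, row in df.iterrows():
        if row['col'] > value_to_check:
            # row is already the matching row, so read the field directly
            return row['value_to_record']
    return None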

But my question is more about efficiency (ideally a vectorized solution that scales well).

Answer 1

Mer*_*lin 9

Try this, using "> 99":

df[df['price'].gt(99)].index[0]

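This returns 2, the index label of the first row whose price is greater than 99 (the third row, since the default integer index starts at 0).

All row index labels where the price is greater than 99: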

df[df['price'].gt(99)].index
Int64Index([2, 5, 6], dtype='int64')
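
For reference, the lookup above can be reproduced on the question's sample prices. This is only a minimal sketch: it assumes a default integer index (which matches the Int64Index output shown), and `first_index_above` is a hypothetical helper name, not part of pandas. It also guards against the case where no value passes the threshold, where `.index[0]` would raise an IndexError. Note that it still evaluates the comparison over the whole column.

```python
import pandas as pd

# Sample prices from the question, with a default integer index (assumption).
df = pd.DataFrame({"price": [98, 99, 100, 99, 98, 100, 100, 98]})


def first_index_above(series, threshold):
    """Return the index label of the first value > threshold, or None."""
    mask = series.gt(threshold)
    # idxmax() on a boolean mask gives the label of the first True, but it
    # also returns the first label when nothing is True, so check any() first.
    if not mask.any():
        return None
    return mask.idxmax()


first_hit = first_index_above(df["price"], 99)   # same row as .index[0] above
no_hit = first_index_above(df["price"], 100)     # nothing exceeds 100 here
```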

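I don't think this answers the core of the question. They are asking whether there is a vectorized NumPy operation that behaves like an iterator, lazily returning a value (we only care about the first one) instead of scanning the whole array before returning. (2 upvotes)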
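
One possible workaround for the concern in this comment is to scan the data in chunks and stop at the first chunk that contains a match. The sketch below is an illustration under assumptions, not a built-in lazy NumPy operation: `first_position_above` and its `chunk_size` parameter are hypothetical names, and the best chunk size would depend on the data.

```python
import numpy as np


def first_position_above(values, threshold, chunk_size=65536):
    """Return the integer position of the first value > threshold, or None.

    Hypothetical helper: each chunk is compared in vectorized form, and the
    scan stops as soon as one chunk contains a match.
    """
    values = np.asarray(values)
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        hits = np.flatnonzero(chunk > threshold)
        if hits.size:
            return start + int(hits[0])
    return None


# Sample prices from the question; a tiny chunk size just to exercise the loop.
prices = np.array([98, 99, 100, 99, 98, 100, 100, 98])
pos = first_position_above(prices, 99, chunk_size=4)
```

With a DataFrame, the returned position could be mapped back to a label with `df.index[pos]` after passing `df['price'].to_numpy()` as `values`.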

Answer 2

use*_*496 5

This returns the index value of the first occurrence of 100 in the series:

index_value = (df['col'] - 100).abs().idxmin()

If there is no value exactly equal to 100, it should return the index of the closest value.
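
To make that behavior concrete, here is a small sketch using the built-in `Series.abs()`. The first series is the question's sample data; the second is made-up data (an assumption for illustration) with no exact 100. It shows that the nearest value can lie below 100, so this approach finds the closest value rather than the first value above a threshold.

```python
import pandas as pd

# Question's sample values: 100 first appears at index 2.
exact = pd.Series([98, 99, 100, 99, 98, 100, 100, 98])
nearest_exact = (exact - 100).abs().idxmin()

# Made-up values with no exact 100: distances are 2, 1, 3, 3,
# so the closest value is 99 at index 1, which is below 100.
inexact = pd.Series([98, 99, 97, 103])
nearest_inexact = (inexact - 100).abs().idxmin()
```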

Archived:	9 years, 5 months ago
Views:	9,379
Last recorded:	8 years, 3 months ago