Pandas: How to create a column that indicates, a set number of rows in advance, when a value occurs in another column?

Bra*_*rad 7 python pandas

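I'm trying to work out how to create a column that indicates in advance (X rows ahead) when the next occurrence of a value in another column will happen, using pandas. In essence it should do the following (in this case, X = 3):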

df

rowid  event   indicator
1      True    1 # Event occurs
2      False   0
3      False   0
4      False   1 # Starts indicator
5      False   1
6      True    1 # Event occurs
7      False   0

Aside from an iterative/recursive loop through every row, I have:

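# index labels of the rows where the event occurs (assumes a default RangeIndex)
i = df.index[df['event']]
# for each event, the X index labels ending at the event row, clamped at row 0
dfx = [df.index[max(z - X + 1, 0):z + 1] for z in i]
df['indicator'] = 0
for ix in dfx:
    df.loc[ix, 'indicator'] = 1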

However, this seems inefficient. Is there a more succinct way to achieve the example above? Thanks

yat*_*atu 1

Here's a NumPy-based approach using flatnonzero:

import numpy as np

X = 3
# ndarray of indices where indicator should be set to one
nd_ixs = np.flatnonzero(df.event)[:,None] - np.arange(X-1, -1, -1)
# flatten the indices
ixs = nd_ixs.ravel()
# filter out negative indices and set to 1 (assumes the default RangeIndex)
df['indicator'] = 0
df.loc[ixs[ixs>=0], 'indicator'] = 1
print(df)

    rowid  event  indicator
0      1   True          1
1      2  False          0
2      3  False          0
3      4  False          1
4      5  False          1
5      6   True          1
6      7  False          0

Where nd_ixs is obtained through the broadcasted subtraction of an arange up to X from the indices where event is True:

print(nd_ixs)

array([[-2, -1,  0],
       [ 3,  4,  5]], dtype=int64)
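
To make the broadcasting step easier to follow, here is a small self-contained sketch that builds the same array piece by piece. The names event_pos and offsets are only illustrative, and the event array is copied from the question's example:

```python
import numpy as np

X = 3
event = np.array([True, False, False, False, False, True, False])

# positions where the event occurs, as a column vector of shape (n_events, 1)
event_pos = np.flatnonzero(event)[:, None]
# offsets X-1, ..., 0 as a row vector of shape (X,)
offsets = np.arange(X - 1, -1, -1)
# broadcasting yields an (n_events, X) array: one row of positions per event
nd_ixs = event_pos - offsets
print(nd_ixs)
```

Each row holds the X positions ending at one event. Negative entries fall before the start of the frame, which is why they are filtered out before assigning.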
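
Note that df.loc selects by label, so the snippet above relies on the labels being the positions 0..n-1. If rowid (or anything else) is the index, a positional variant may be safer. A minimal sketch, assuming a hypothetical helper called lead_indicator (not part of pandas or NumPy):

```python
import numpy as np
import pandas as pd

def lead_indicator(event, X):
    """Return a 0/1 Series marking each event row and the X - 1 rows before it."""
    # positions of the events minus offsets 0..X-1, via broadcasting
    pos = np.flatnonzero(event.to_numpy())[:, None] - np.arange(X)
    # boolean masking flattens the array and drops positions before row 0
    pos = pos[pos >= 0]
    out = np.zeros(len(event), dtype=int)
    out[pos] = 1  # purely positional, so the index labels do not matter
    return pd.Series(out, index=event.index, name="indicator")

# example frame from the question, with rowid as a 1-based index
df = pd.DataFrame(
    {"event": [True, False, False, False, False, True, False]},
    index=pd.Index(range(1, 8), name="rowid"),
)
df["indicator"] = lead_indicator(df["event"], X=3)
print(df)
```

The result is aligned back to the original index, so it can be assigned as a column directly.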