如何在条件满足之前用N个行中的一些对条件行进行子集化，比我的代码更快？

Question

如何在条件满足之前用N个行中的一些对条件行进行子集化，比我的代码更快？

nrm*_*zmh 9 python subset slice conditional-statements pandas

由于我的数据集是时间序列，因此我有30个不同的数据帧，每个数据帧的行数都超过10,000。我想检查一下温度值低于40之前的趋势。

因此，我想在温度值低于40时对行进行子集化，并且还希望在温度值低于40之前对24行进行子集化。

我已经尝试了一些代码，下面工作的唯一代码。但是子集需要更长的时间（例如，一个数据帧超过10分钟）。因此，我的代码很糟糕。所以我想知道python中可以更快子集的代码。你们能帮我吗？

df=temperature_df.copy()
drop_temperature_df=pd.DataFrame()

# get the index during drop temperature
drop_temperature_index=np.array(df[df[temperature]<40].index)

# subset the data frame for 24 hours before drop temperature
for i,index in enumerate(drop_temperature_index):
    drop_temperature_df=drop_temperature_df.append(df.loc[index-24:index,:])

K['K_{}'.format(string)]=drop_temperature_df.copy() #save the subset data frame

Run Code Online (Sandbox Code Playgroud)

因此，像下面的数据一样，我在1/26/2018 0800的温度点低于40，所以，我想在40以下的温度子集使用之前的24行（1/25/2018 0800直到1/26/2018 0800）。

在此处输入图片说明

Answer 1

WeN*_*Ben 6

我认为您可以使用ffillwith limit，并找到notnull index，对数据帧进行切片

yourdf=df[df.temperature.where(df.temperature<40).bfill(limit=24).notnull()].copy()

Run Code Online (Sandbox Code Playgroud)

是。您的代码速度更快，我的所有数据帧只需花费一分钟。谢谢！ (2认同)

归档时间：	6 年，9 月前
查看次数：	116 次
最近记录：	6 年，9 月前