Art*_*nov 2 python dataframe pandas
I have a pandas DataFrame with a column df['realize']:
time realize
2016-01-18 08:25:00 -46.369083
2016-01-19 14:30:00 -819.010738
2016-01-20 11:10:00 -424.955847
2016-01-21 07:15:00 27.523859
2016-01-21 16:10:00 898.522762
2016-01-25 00:00:00 761.063545
where time is set as the index:
df.index = df['time']
df.index = pd.to_datetime(df.index)
and df['realize'] is a Series:
In: type(df['realize'])
Out: pandas.core.series.Series
I want to count consecutive values. The rule is simple: count runs where df['realize'] > 0 and runs where df['realize'] < 0.
Expected output:
time realize Consecutive
2016-01-18 08:25:00 -46.369083 1
2016-01-19 14:30:00 -819.010738 2
2016-01-20 11:10:00 -424.955847 3
2016-01-21 07:15:00 27.523859 1
2016-01-21 16:10:00 898.522762 2
2016-01-25 00:00:00 761.063545 3
I have read threads about doing this with loops but did not find what I need. Thanks in advance for your help.
You can do the following:
g = df.realize.gt(0).astype(int).diff().fillna(0).abs().cumsum()
df['Consecutive'] = df.groupby(g).realize.cumcount().add(1)
time realize Consecutive
0 2016-01-18 08:25:00 -46.369083 1
1 2016-01-19 14:30:00 -819.010738 2
2 2016-01-20 11:10:00 -424.955847 3
3 2016-01-21 07:15:00 27.523859 1
4 2016-01-21 16:10:00 898.522762 2
5 2016-01-25 00:00:00 761.063545 3
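For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the sample rows from the question, with time kept as an ordinary column as in the output above. The DataFrame construction is an illustration added here; the two lines computing g and Consecutive are the ones from the answer.

```python
import pandas as pd

# Sample data copied from the question (construction is illustrative).
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2016-01-18 08:25:00",
        "2016-01-19 14:30:00",
        "2016-01-20 11:10:00",
        "2016-01-21 07:15:00",
        "2016-01-21 16:10:00",
        "2016-01-25 00:00:00",
    ]),
    "realize": [-46.369083, -819.010738, -424.955847,
                27.523859, 898.522762, 761.063545],
})

# Run label: increases by 1 every time the sign flag (realize > 0) changes.
g = df.realize.gt(0).astype(int).diff().fillna(0).abs().cumsum()

# Position of each row inside its run, counted from 1.
df["Consecutive"] = df.groupby(g).realize.cumcount().add(1)

print(df)
```

Note that gt(0) treats a value of exactly 0 as part of a non-positive run; the question does not say how zeros should be handled.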
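The grouper used here is obtained by taking the first difference (Series.diff) of the boolean series indicating whether realize is greater than 0, and then accumulating it with cumsum: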
diff = df.realize.gt(0).astype(int).diff().fillna(0).abs()
df.assign(diff = diff, grouper = g)
time realize Consecutive diff grouper
0 2016-01-18 08:25:00 -46.369083 1 0.0 0.0
1 2016-01-19 14:30:00 -819.010738 2 0.0 0.0
2 2016-01-20 11:10:00 -424.955847 3 0.0 0.0
3 2016-01-21 07:15:00 27.523859 1 1.0 1.0
4 2016-01-21 16:10:00 898.522762 2 0.0 1.0
5 2016-01-25 00:00:00 761.063545 3 0.0 1.0
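The sample data changes sign only once, so it does not show why the abs() step is there. The sketch below uses made-up values (not from the question) that flip sign several times. It also compares the diff-based grouper with another common way to label runs, comparing each flag with its shifted neighbour. That second form is an alternative added here, not part of the answer above.

```python
import pandas as pd

# Hypothetical values with several sign changes (not from the question).
s = pd.Series([-1.0, -2.0, 3.0, -4.0, 5.0, 6.0], name="realize")
flag = s.gt(0)

# Grouper from the answer: |first difference| marks each sign change with 1.
g_diff = flag.astype(int).diff().fillna(0).abs().cumsum()

# Alternative grouper: a new run starts where the flag differs from the previous row.
g_shift = flag.ne(flag.shift()).cumsum()

consecutive_diff = s.groupby(g_diff).cumcount().add(1)
consecutive_shift = s.groupby(g_shift).cumcount().add(1)

# Without abs(), a +1 change followed by a -1 change cancels in the cumsum,
# so separate runs end up sharing one label and the counts are wrong.
g_no_abs = flag.astype(int).diff().fillna(0).cumsum()
consecutive_no_abs = s.groupby(g_no_abs).cumcount().add(1)

print(pd.DataFrame({
    "realize": s,
    "diff_based": consecutive_diff,
    "shift_based": consecutive_shift,
    "without_abs": consecutive_no_abs,
}))
```

Both groupers give the same counts; they differ only in the run labels themselves (starting at 0 versus 1), which does not matter to groupby.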