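I'm computing a rolling sum on a grouped DataFrame, but it adds up in the wrong direction: I get the sum of future rows when I need the sum of past rows.
What am I doing wrong here?
I import the data and sort by Dimension and Date (I have also tried removing the sort on Date):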
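import pandas as pd

# parse_dates=True only tries to parse the index, so Date stays a string column
df = pd.read_csv('Input.csv', parse_dates=True)
# sort_values returns a new DataFrame; the result is not assigned here
df.sort_values(['Dimension','Date'])
print(df)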
Then I create a new column by applying a rolling window within each group, which gives a result with a MultiIndex:
new_column = df.groupby('Dimension').Value1.apply(
    lambda x: x.rolling(window=3).sum())
Then I reset the index so that it matches the original one:
df['Sum_Value1'] = new_column.reset_index(level=0, drop=True)
print(df)
I also tried reversing the index before the calculation, but that failed too.
Input:
Dimension,Date,Value1,Value2
1,4/30/2002,10,20
1,1/31/2002,10,20
1,10/31/2001,10,20
1,7/31/2001,10,20
1,4/30/2001,10,20
1,1/31/2001,10,20
1,10/31/2000,10,20
2,4/30/2002,10,20
2,1/31/2002,10,20
2,10/31/2001,10,20
2,7/31/2001,10,20
2,4/30/2001,10,20
2,1/31/2001,10,20
2,10/31/2000,10,20
3,4/30/2002,10,20
3,1/31/2002,10,20
3,10/31/2001,10,20
3,7/31/2001,10,20
3,1/31/2001,10,20
3,10/31/2000,10,20
Output:
    Dimension        Date  Value1  Value2  Sum_Value1
 0          1   4/30/2002      10      20         NaN
 1          1   1/31/2002      10      20         NaN
 2          1  10/31/2001      10      20        30.0
 3          1   7/31/2001      10      20        30.0
 4          1   4/30/2001      10      20        30.0
 5          1   1/31/2001      10      20        30.0
 6          1  10/31/2000      10      20        30.0
 7          2   4/30/2002      10      20         NaN
 8          2   1/31/2002      10      20         NaN
 9          2  10/31/2001      10      20        30.0
10          2   7/31/2001      10      20        30.0
11          2   4/30/2001      10      20        30.0
12          2   1/31/2001      10      20        30.0
13          2  10/31/2000      10      20        30.0
Desired output:
    Dimension        Date  Value1  Value2  Sum_Value1
 0          1   4/30/2002      10      20        30.0
 1          1   1/31/2002      10      20        30.0
 2          1  10/31/2001      10      20        30.0
 3          1   7/31/2001      10      20        30.0
 4          1   4/30/2001      10      20        30.0
 5          1   1/31/2001      10      20         NaN
 6          1  10/31/2000      10      20         NaN
 7          2   4/30/2002      10      20        30.0
 8          2   1/31/2002      10      20        30.0
 9          2  10/31/2001      10      20        30.0
10          2   7/31/2001      10      20        30.0
11          2   4/30/2001      10      20        30.0
12          2   1/31/2001      10      20         NaN
13          2  10/31/2000      10      20         NaN
You can shift the result by window-1 rows to get a left-aligned result:
df["sum_value1"] = (df.groupby('Dimension').Value1
.apply(lambda x: x.rolling(window=3).sum().shift(-2)))
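To make the effect of the shift easier to see, here is a minimal sketch on made-up values (the small frame and the `window` variable are illustrative assumptions, not the asker's data). It uses `transform` instead of `apply` so the result keeps the original row index and can be assigned directly; that swap is my choice, not part of the answer above.

```python
import pandas as pd

# Illustrative data only: two groups, rows listed newest-first as in the question.
df = pd.DataFrame({
    "Dimension": [1, 1, 1, 1, 2, 2, 2, 2],
    "Value1": [1, 2, 3, 4, 10, 20, 30, 40],
})

window = 3

# Trailing rolling sum per group, then shift up by window-1 rows so each row
# holds the sum of itself and the next window-1 rows within its group.
df["Sum_Value1"] = df.groupby("Dimension")["Value1"].transform(
    lambda x: x.rolling(window=window).sum().shift(-(window - 1))
)
print(df)
```

With this layout the NaN values end up on the last two rows of each group rather than the first two, which is the pattern in the desired output.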
小智 6
Rolling backward is the same as rolling forward and then shifting the result:
x.rolling(window=3).sum().shift(-2)
You need a reversed sum, so reverse your series before taking the rolling sum:
lambda x: x[::-1].rolling(window=3).sum()
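As a quick check that the two approaches agree, the sketch below compares them on a made-up series (the values are assumptions for illustration). It uses `iloc[::-1]` for the reversal, which I am assuming is equivalent to `x[::-1]` for this purpose. The reversed result keeps the original index labels in reversed order, so `sort_index` is used here to put it back in the original row order before comparing.

```python
import pandas as pd

# Illustrative values only.
s = pd.Series([1, 2, 3, 4, 5])

# Approach 1: reverse the series, then take a trailing rolling sum.
reversed_sum = s.iloc[::-1].rolling(window=3).sum()

# Approach 2: trailing rolling sum, then shift up by window-1 rows.
shifted_sum = s.rolling(window=3).sum().shift(-2)

# Restore the original row order so the two results can be compared side by side.
comparison = pd.DataFrame({
    "reversed": reversed_sum.sort_index(),
    "shifted": shifted_sum,
})
print(comparison)
```

When the reversed result is assigned to a DataFrame column, pandas aligns on index labels, so the explicit `sort_index` is only needed for printing or positional comparison.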