I have a DataFrame that looks like this:
   FinancialYearStart  MonthOfFinancialYear  SalesTotal
0                2015                     1          10
1                2015                     2          10
2                2015                     5          10
3                2015                     6          50
4                2016                     1          10
5                2016                     3          20
6                2016                     2          30
7                2017                     6          70
8                2017                     7          80
I want to calculate the year-to-date (YTD) sales total for each month, producing a table like this:
   FinancialYearStart  MonthOfFinancialYear  SalesTotal  YTDTotal
0                2015                     1          10        10
1                2015                     2          10        20
2                2015                     5          10        30
3                2015                     6          50        80
4                2016                     1          10        10
5                2016                     3          20        30
6                2016                     2          30        60
7                2017                     6          70        70
8                2017                     7          80       150
How can I do this?
More specifically, I actually need to compute this per group.
For example:
Year  Month  Customer  TotalMonthlySales
2015      1       Dog                 10
2015      2       Dog                 10
2015      3       Cat                 20
2015      4       Dog                 30
2015      5       Cat                 10
2015      7       Cat                 20
2015      7       Dog                 10
2016      1       Dog                 40
2016      2       Dog                 20
2016      3       Cat                 70
2016      4       Dog                 30
2016      5       Cat                 10
2016      6       Cat                 20
2016      7       Dog                 10
would give:
Year  Month  Customer  TotalMonthlySales  YTDSales
2015      1       Dog                 10        10
2015      2       Dog                 10        20
2015      3       Cat                 20        20
2015      4       Dog                 30        50
2015      5       Cat                 10        30
2015      7       Cat                 20        50
2015      7       Dog                 10        60
2016      1       Dog                 40        40
2016      2       Dog                 20        60
2016      3       Cat                 70        70
2016      4       Dog                 30        90
2016      5       Cat                 10        80
2016      6       Cat                 20       100
2016      7       Dog                 10       100
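For the per-group case, use groupby with cumsum:
# Running total that restarts for every (Year, Customer) pair
df['YTDSales'] = df.groupby(['Year', 'Customer'])['TotalMonthlySales'].cumsum()
print(df)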
    Year  Month  Customer  TotalMonthlySales  YTDSales
0   2015      1       Dog                 10        10
1   2015      2       Dog                 10        20
2   2015      3       Cat                 20        20
3   2015      4       Dog                 30        50
4   2015      5       Cat                 10        30
5   2015      7       Cat                 20        50
6   2015      7       Dog                 10        60
7   2016      1       Dog                 40        40
8   2016      2       Dog                 20        60
9   2016      3       Cat                 70        70
10  2016      4       Dog                 30        90
11  2016      5       Cat                 10        80
12  2016      6       Cat                 20       100
13  2016      7       Dog                 10       100
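To try the grouped calculation end to end, here is a minimal self-contained sketch. It rebuilds the per-customer sample from the question by hand (the construction of `df` is an assumption about how the data is loaded; the column names are taken from the question) and then applies the same groupby and cumsum call.

```python
import pandas as pd

# Rebuild the per-customer example from the question.
df = pd.DataFrame({
    'Year': [2015] * 7 + [2016] * 7,
    'Month': [1, 2, 3, 4, 5, 7, 7, 1, 2, 3, 4, 5, 6, 7],
    'Customer': ['Dog', 'Dog', 'Cat', 'Dog', 'Cat', 'Cat', 'Dog',
                 'Dog', 'Dog', 'Cat', 'Dog', 'Cat', 'Cat', 'Dog'],
    'TotalMonthlySales': [10, 10, 20, 30, 10, 20, 10,
                          40, 20, 70, 30, 10, 20, 10],
})

# Running total that restarts for every (Year, Customer) pair.
# cumsum returns a Series aligned to the original index, so it can be
# assigned straight back as a new column.
df['YTDSales'] = df.groupby(['Year', 'Customer'])['TotalMonthlySales'].cumsum()
print(df)
```

Because the grouped cumsum keeps the original row order and index, no merge or re-sort is needed to attach the result.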
For the first case:
# Running total that restarts at the beginning of each financial year
df['YTDTotal'] = df.groupby('FinancialYearStart')['SalesTotal'].cumsum()
print(df)
   FinancialYearStart  MonthOfFinancialYear  SalesTotal  YTDTotal
0                2015                     1          10        10
1                2015                     2          10        20
2                2015                     5          10        30
3                2015                     6          50        80
4                2016                     1          10        10
5                2016                     3          20        30
6                2016                     2          30        60
7                2017                     6          70        70
8                2017                     7          80       150
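One caveat: cumsum accumulates in row order, and in the sample data the 2016 rows list month 3 before month 2, so the value shown for month 3 does not yet include month 2. The following is a minimal sketch, assuming the year-to-date figure should follow month order within each financial year. The name `df_sorted` is only illustrative.

```python
import pandas as pd

# Sample data from the question (note month 3 precedes month 2 in 2016).
df = pd.DataFrame({
    'FinancialYearStart': [2015, 2015, 2015, 2015, 2016, 2016, 2016, 2017, 2017],
    'MonthOfFinancialYear': [1, 2, 5, 6, 1, 3, 2, 6, 7],
    'SalesTotal': [10, 10, 10, 50, 10, 20, 30, 70, 80],
})

# Sort chronologically first so the running total follows month order.
df_sorted = df.sort_values(['FinancialYearStart', 'MonthOfFinancialYear'])

# Same groupby + cumsum as above, now applied to the sorted rows.
df_sorted['YTDTotal'] = df_sorted.groupby('FinancialYearStart')['SalesTotal'].cumsum()
print(df_sorted)
```

With the rows sorted, 2016 month 2 gets a running total of 40 and month 3 gets 60. If the rows are already in chronological order, the sort is unnecessary.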