bos*_*elo 5 python dataframe pandas pandas-groupby
I have a df that contains multiple weekly snapshots of JIRA tickets. I want to calculate the YTD counts of tickets.
the df looks like this:
pointInTime ticketId
2008-01-01 111
2008-01-01 222
2008-01-01 333
2008-01-07 444
2008-01-07 555
2008-01-07 666
2008-01-14 777
2008-01-14 888
2008-01-14 999
Run Code Online (Sandbox Code Playgroud)
So if I df.groupby(['pointInTime'])['ticketId'].count() I can get the count of Ids in every snaphsots. But what I want to achieve is calculate the cumulative sum.
and have a df looks like this:
pointInTime ticketId cumCount
2008-01-01 111 3
2008-01-01 222 3
2008-01-01 333 3
2008-01-07 444 6
2008-01-07 555 6
2008-01-07 666 6
2008-01-14 777 9
2008-01-14 888 9
2008-01-14 999 9
Run Code Online (Sandbox Code Playgroud)
so for 2008-01-07 number of ticket would be count of 2008-01-07 + count of 2008-01-01.
Use GroupBy.count and cumsum, then map the result back to "pointInTime":
df['cumCount'] = (
df['pointInTime'].map(df.groupby('pointInTime')['ticketId'].count().cumsum()))
df
pointInTime ticketId cumCount
0 2008-01-01 111 3
1 2008-01-01 222 3
2 2008-01-01 333 3
3 2008-01-07 444 6
4 2008-01-07 555 6
5 2008-01-07 666 6
6 2008-01-14 777 9
7 2008-01-14 888 9
8 2008-01-14 999 9
Run Code Online (Sandbox Code Playgroud)