Sum a DataFrame column based on a per-row condition

Dav*_*way 2 python dataframe pandas

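How can I sum a column based on a dynamic value taken from the current row? Using the example below, I want to iterate over each row (x), compute the sum of all Clicks where Date == x.Date_Yesterday, and add that total as a new column.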

Input data:

import pandas as pd

df = pd.DataFrame({
    'Date': ['2021-09-14','2021-09-14','2021-09-14','2021-09-13','2021-09-12','2021-09-12','2021-09-11'],
    'Date_Yesterday': ['2021-09-13','2021-09-13','2021-09-13','2021-09-12','2021-09-11','2021-09-11','2021-09-10'],
    'Clicks': [100,100,100,50,10,10,1]
})


   Date           Date_Yesterday  Clicks
0  2021-09-14     2021-09-13      100
1  2021-09-14     2021-09-13      100
2  2021-09-14     2021-09-13      100
3  2021-09-13     2021-09-12      50
4  2021-09-12     2021-09-11      10
5  2021-09-12     2021-09-11      10
6  2021-09-11     2021-09-10      1

Desired output:

Date          Date_Yesterday   Clicks  Total_Clicks_Yesterday
2021-09-14    2021-09-13       100     50
2021-09-14    2021-09-13       100     50
2021-09-14    2021-09-13       100     50
2021-09-13    2021-09-12       50      20
2021-09-12    2021-09-11       10      1
2021-09-12    2021-09-11       10      1
2021-09-11    2021-09-10       1       N/A

Computing Total_Clicks_Yesterday with a static value is simple:

# The scalar sum is broadcast to every row, so all rows get the same value
df['Total_Clicks_Yesterday'] = df.loc[df['Date'] == '2021-09-13', 'Clicks'].sum()
print(df)

         Date Date_Yesterday  Clicks  Total_Clicks_Yesterday
0  2021-09-14     2021-09-13     100                      50
1  2021-09-14     2021-09-13     100                      50
2  2021-09-14     2021-09-13     100                      50
3  2021-09-13     2021-09-12      50                      50
4  2021-09-12     2021-09-11      10                      50
5  2021-09-12     2021-09-11      10                      50
6  2021-09-11     2021-09-10       1                      50

But I'm not sure how to make it dynamic for each row.

Ben*_*n.T 7

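You can groupby the Date column and sum the Clicks column to get the total clicks for each day. Then use map on the Date_Yesterday column with the groupby result, which aligns each row with the previous day's total. A Date_Yesterday value that does not appear in Date maps to NaN, which is why the new column is float.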

df['Total_Clicks_Yesterday'] = df['Date_Yesterday'].map(df.groupby('Date')['Clicks'].sum())
print(df)
         Date Date_Yesterday  Clicks  Total_Clicks_Yesterday
0  2021-09-14     2021-09-13     100                    50.0
1  2021-09-14     2021-09-13     100                    50.0
2  2021-09-14     2021-09-13     100                    50.0
3  2021-09-13     2021-09-12      50                    20.0
4  2021-09-12     2021-09-11      10                     1.0
5  2021-09-12     2021-09-11      10                     1.0
6  2021-09-11     2021-09-10       1                     NaN
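To make the lookup easier to follow, here is a minimal sketch that splits the one-liner into its two steps. The name daily_totals and the final conversion to the nullable Int64 dtype are my own additions, not part of the answer above. The conversion is optional and only assumes you would rather see whole numbers with <NA> than floats with NaN.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2021-09-14', '2021-09-14', '2021-09-14', '2021-09-13',
             '2021-09-12', '2021-09-12', '2021-09-11'],
    'Date_Yesterday': ['2021-09-13', '2021-09-13', '2021-09-13', '2021-09-12',
                       '2021-09-11', '2021-09-11', '2021-09-10'],
    'Clicks': [100, 100, 100, 50, 10, 10, 1],
})

# Step 1: total clicks per date, as a Series indexed by Date
# (daily_totals is a hypothetical name used only for this sketch).
daily_totals = df.groupby('Date')['Clicks'].sum()

# Step 2: look up each row's Date_Yesterday in that Series.
# Dates with no entry (here 2021-09-10) come back as NaN.
df['Total_Clicks_Yesterday'] = df['Date_Yesterday'].map(daily_totals)

# Optional assumption: switch to the nullable integer dtype so the totals
# stay whole numbers and missing values are shown as <NA>.
df['Total_Clicks_Yesterday'] = df['Total_Clicks_Yesterday'].astype('Int64')

print(daily_totals)
print(df)
```

Because map performs the lookup against the index of daily_totals, no explicit loop over rows is needed, and rows that share a Date_Yesterday receive the same total.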