Pandas - subtract the minimum date from the maximum date for each group

Sup*_*man 4 python group-by pandas

I want to add a column that holds, for each customer_id in this table, the result of subtracting the minimum date from the maximum date.

Input:

action_date customer_id
 2017-08-15       1
 2017-08-21       1
 2017-08-21       1
 2017-09-02       1
 2017-08-28       2
 2017-09-29       2
 2017-10-15       3   
 2017-10-30       3
 2017-12-05       3

to get this table

Output:

action_date customer_id    diff
 2017-08-15       1         18
 2017-08-21       1         18
 2017-08-21       1         18
 2017-09-02       1         18
 2017-08-28       2         32
 2017-09-29       2         32
 2017-10-15       3         51
 2017-10-30       3         51
 2017-12-05       3         51

I tried this code, but it fills the column with a lot of NaN values:

group = df.groupby(by='customer_id')
df['diff'] = (group['action_date'].max() - group['action_date'].min()).dt.days

Max*_*axU 9

You can use the transform method:

In [23]: df['diff'] = df.groupby('customer_id') \
                        ['action_date'] \
                        .transform(lambda x: (x.max()-x.min()).days)

In [24]: df
Out[24]:
  action_date  customer_id  diff
0  2017-08-15            1    18
1  2017-08-21            1    18
2  2017-08-21            1    18
3  2017-09-02            1    18
4  2017-08-28            2    32
5  2017-09-29            2    32
6  2017-10-15            3    51
7  2017-10-30            3    51
8  2017-12-05            3    51
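For context on why the attempt in the question produces NaN, and as a possible alternative to the lambda, here is a minimal sketch. It assumes action_date is already a datetime64 column; the names dates, span, misaligned and diff_mapped are only illustrative and do not come from the original post.

```python
import pandas as pd

# Sample data from the question, with action_date parsed as datetime64.
df = pd.DataFrame({
    "action_date": pd.to_datetime([
        "2017-08-15", "2017-08-21", "2017-08-21", "2017-09-02",
        "2017-08-28", "2017-09-29",
        "2017-10-15", "2017-10-30", "2017-12-05",
    ]),
    "customer_id": [1, 1, 1, 1, 2, 2, 3, 3, 3],
})

dates = df.groupby("customer_id")["action_date"]

# Aggregating yields one value per group, indexed by customer_id (1, 2, 3).
span = (dates.max() - dates.min()).dt.days

# Assigning that Series aligns on the row index (0..8), not on customer_id,
# so only the rows labelled 1, 2 and 3 receive a value; the rest become NaN.
misaligned = df.assign(diff=span)["diff"]

# transform keeps the original row index, so the result lines up with df.
# Passing the built-in names "max" and "min" avoids a Python-level lambda.
df["diff"] = (dates.transform("max") - dates.transform("min")).dt.days

# Another option: look up each row's customer_id in the per-group result.
df["diff_mapped"] = df["customer_id"].map(span)
```

Both diff and diff_mapped should hold the same per-customer day counts as the output above. The string-based transform is likely to be faster on large frames than the lambda, though that is worth measuring on real data.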