Add an extra column as the cumulative time difference

Car*_*cca 5 python timestamp dataframe pandas

How can I add an extra column that holds the cumulative time difference for each course? For example, the initial table is:

 id_A       course     weight                ts_A       value
 id1        cotton     3.5       2017-04-27 01:35:30  150.000000
 id1        cotton     3.5       2017-04-27 01:36:00  416.666667
 id1        cotton     3.5       2017-04-27 01:36:30  700.000000
 id1        cotton     3.5       2017-04-27 01:37:00  950.000000
 id2     cotton blue   5.0       2017-04-27 02:35:30  150.000000
 id2     cotton blue   5.0       2017-04-27 02:36:00  450.000000
 id2     cotton blue   5.0       2017-04-27 02:36:30  520.666667
 id2     cotton blue   5.0       2017-04-27 02:37:00  610.000000

The expected result is:

 id_A       course     weight                ts_A       value      cum_delta_sec
 id1        cotton     3.5       2017-04-27 01:35:30  150.000000      0
 id1        cotton     3.5       2017-04-27 01:36:00  416.666667      30 
 id1        cotton     3.5       2017-04-27 01:36:30  700.000000      60
 id1        cotton     3.5       2017-04-27 01:37:00  950.000000      90
 id2     cotton blue   5.0       2017-04-27 02:35:30  150.000000      0
 id2     cotton blue   5.0       2017-04-27 02:36:00  450.000000      30
 id2     cotton blue   5.0       2017-04-27 02:36:30  520.666667      60
 id2     cotton blue   5.0       2017-04-27 02:37:00  610.000000      90

Psi*_*dom 6

You can chain the diff method with cumsum:

import pandas as pd

# convert ts_A to datetime type
df['ts_A'] = pd.to_datetime(df['ts_A'])

# convert ts_A to seconds (assumes datetime64[ns], so the int64 values are nanoseconds),
# group by id_A, then use transform to calculate the cumulative difference
df['cum_delta_sec'] = (df['ts_A'].astype('int64').div(10**9)
                         .groupby(df['id_A'])
                         .transform(lambda x: x.diff().fillna(0).cumsum()))
df
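The same diff-then-cumsum idea can also be sketched without casting the timestamps to integers, so it does not depend on how the datetime column is stored internally. This is a variant for illustration, not the code from the answer; the sample frame below is rebuilt from the question's table with only the two columns needed, and the name step_sec is made up for the example.

```python
import pandas as pd

# Sample data copied from the question (only the columns used here).
df = pd.DataFrame({
    "id_A": ["id1"] * 4 + ["id2"] * 4,
    "ts_A": pd.to_datetime([
        "2017-04-27 01:35:30", "2017-04-27 01:36:00",
        "2017-04-27 01:36:30", "2017-04-27 01:37:00",
        "2017-04-27 02:35:30", "2017-04-27 02:36:00",
        "2017-04-27 02:36:30", "2017-04-27 02:37:00",
    ]),
})

# Difference between consecutive timestamps within each id_A, as float seconds.
# The first row of each group has no predecessor (NaT), so it is filled with 0.
step_sec = df.groupby("id_A")["ts_A"].diff().dt.total_seconds().fillna(0)

# Running total of those steps, restarted for every id_A.
df["cum_delta_sec"] = step_sec.groupby(df["id_A"]).cumsum()
print(df)
```

Because total_seconds() returns floats, the new column is float-valued; append .astype(int) if whole seconds are wanted.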


Sco*_*ton 5

Use groupby, transform, and .iloc:

df['ts_A'] = pd.to_datetime(df.ts_A)
df['cum_delta_sec'] = (df.groupby('id_A')['ts_A']
                         .transform(lambda x: (x - x.iloc[0]).dt.total_seconds()))

Output:

  id_A       course  weight                ts_A       value  cum_delta_sec
0  id1       cotton     3.5 2017-04-27 01:35:30  150.000000              0
1  id1       cotton     3.5 2017-04-27 01:36:00  416.666667             30
2  id1       cotton     3.5 2017-04-27 01:36:30  700.000000             60
3  id1       cotton     3.5 2017-04-27 01:37:00  950.000000             90
4  id2  cotton blue     5.0 2017-04-27 02:35:30  150.000000              0
5  id2  cotton blue     5.0 2017-04-27 02:36:00  450.000000             30
6  id2  cotton blue     5.0 2017-04-27 02:36:30  520.666667             60
7  id2  cotton blue     5.0 2017-04-27 02:37:00  610.000000             90

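Within each group, subtract the first timestamp from the current one, then use the .dt accessor to convert the resulting timedelta to seconds.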
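For reference, the subtraction from the group's first timestamp can also be written without a lambda by broadcasting the first value with transform("first"). This is only a sketch of an equivalent formulation, not the answer's code: it assumes the rows are already sorted by ts_A within each id_A, and the frame below is rebuilt from the question's table (first_ts is a name made up for the example).

```python
import pandas as pd

# Sample data copied from the question (only the columns used here).
df = pd.DataFrame({
    "id_A": ["id1"] * 4 + ["id2"] * 4,
    "ts_A": pd.to_datetime([
        "2017-04-27 01:35:30", "2017-04-27 01:36:00",
        "2017-04-27 01:36:30", "2017-04-27 01:37:00",
        "2017-04-27 02:35:30", "2017-04-27 02:36:00",
        "2017-04-27 02:36:30", "2017-04-27 02:37:00",
    ]),
})

# First timestamp of each group, repeated on every row of that group.
first_ts = df.groupby("id_A")["ts_A"].transform("first")

# Elapsed time since the group's first row, converted to float seconds.
df["cum_delta_sec"] = (df["ts_A"] - first_ts).dt.total_seconds()
print(df)
```

If the rows might not be in time order, replacing "first" with "min" measures from the earliest timestamp of each group instead.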