Find equal timestamps and incrementally add a constant

gab*_*how 2 python datetime group-by pandas

I have a DataFrame `df` that contains some timestamps:

df['Date'].values
Out[16]: 
array(['2015-03-25T14:36:39.199994000', '2015-03-25T14:36:39.199994000',
       '2015-03-26T10:05:03.699999000', '2015-04-19T16:01:49.680009000',
       '2015-04-19T16:36:10.040007000', '2015-04-19T16:36:10.040007000',
       '2015-04-19T16:36:10.040007000'], dtype='datetime64[ns]')

As you can see, the first and second timestamps are equal, and so are the last three.

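I would like to scan the DataFrame and, wherever timestamps are equal, keep the first one and add 5 seconds incrementally to each of the other equal ones. The new DataFrame should look like this: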

df['Date'].values
Out[16]: 
array(['2015-03-25T14:36:39.199994000', '2015-03-25T14:36:44.199994000',
       '2015-03-26T10:05:03.699999000', '2015-04-19T16:01:49.680009000',
       '2015-04-19T16:36:10.040007000', '2015-04-19T16:36:15.040007000',
       '2015-04-19T16:36:20.040007000'], dtype='datetime64[ns]')

Is there a Pythonic way to do this without a loop? I was thinking of grouping by timestamp, but then I don't know how to continue.

Flo*_*oor 7

Use groupby with cumcount and multiply the result by a Timedelta, i.e.

import numpy as np
import pandas as pd

df = pd.DataFrame({'Date':np.array(['2015-03-25T14:36:39.199994000', '2015-03-25T14:36:39.199994000',
   '2015-03-26T10:05:03.699999000', '2015-04-19T16:01:49.680009000',
   '2015-04-19T16:36:10.040007000', '2015-04-19T16:36:10.040007000',
   '2015-04-19T16:36:10.040007000'], dtype='datetime64[ns]')})

# cumcount() numbers the rows within each group of equal timestamps (0, 1, 2, ...)
df['Date'] + df.groupby(df['Date']).cumcount()*pd.Timedelta('5 seconds')

Output:

0   2015-03-25 14:36:39.199994
1   2015-03-25 14:36:44.199994
2   2015-03-26 10:05:03.699999
3   2015-04-19 16:01:49.680009
4   2015-04-19 16:36:10.040007
5   2015-04-19 16:36:15.040007
6   2015-04-19 16:36:20.040007
dtype: datetime64[ns]
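
To make the intermediate steps visible, here is a minimal sketch of the same idea that also writes the result back into the column. The variable names `occurrence` and `offset` are illustrative and not part of the answer above. It assumes that a shifted timestamp never lands on another timestamp already present in the column.

```python
import pandas as pd

df = pd.DataFrame({'Date': pd.to_datetime([
    '2015-03-25T14:36:39.199994000', '2015-03-25T14:36:39.199994000',
    '2015-03-26T10:05:03.699999000', '2015-04-19T16:01:49.680009000',
    '2015-04-19T16:36:10.040007000', '2015-04-19T16:36:10.040007000',
    '2015-04-19T16:36:10.040007000'])})

# Position of each row within its group of identical timestamps.
# For the data above this is 0, 1, 0, 0, 0, 1, 2.
occurrence = df.groupby('Date').cumcount()

# Turn the position into a time offset: 5 seconds per repeat.
offset = pd.to_timedelta(occurrence * 5, unit='s')

# The first occurrence gets an offset of 0 and therefore stays unchanged.
df['Date'] = df['Date'] + offset
```

Because `cumcount` counts by group membership rather than adjacency, equal timestamps do not need to sit next to each other for this to work. If shifted values could collide with existing ones, checking `df['Date'].is_unique` afterwards is one way to detect that.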