I have a DataFrame like this:
   N                start
1  1   08/01/2014 9:30:02
2  1  08/01/2014 10:30:02
3  2  08/01/2014 12:30:02
4  3   08/01/2014 4:30:02
I need to duplicate each row N times, adding one hour to start for each additional copy, like this:
   N                start
1  1   08/01/2014 9:30:02
2  1  08/01/2014 10:30:02
3  2  08/01/2014 12:30:02
3  2  08/01/2014 13:30:02
4  3   08/01/2014 4:30:02
4  3   08/01/2014 5:30:02
4  3   08/01/2014 6:30:02
How can I do this in pandas?
You can use reindex to expand the DataFrame, then use a TimedeltaIndex to add the hours:
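import numpy as np
import pandas as pd

df = pd.DataFrame({'N': [1, 1, 2, 3],
                   'start': ['08/01/2014 9:30:02',
                             '08/01/2014 10:30:02',
                             '08/01/2014 12:30:02',
                             '08/01/2014 4:30:02']})
df['start'] = pd.to_datetime(df['start'])
# Repeat each index label N times; reindexing on the repeated labels duplicates the rows
df = df.reindex(np.repeat(df.index.values, df['N']), method='ffill')
# cumcount() numbers the copies of each original row 0, 1, 2, ...; add that many hours
df['start'] += pd.TimedeltaIndex(df.groupby(level=0).cumcount(), unit='h')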
which yields
   N               start
0  1 2014-08-01 09:30:02
1  1 2014-08-01 10:30:02
2  2 2014-08-01 12:30:02
2  2 2014-08-01 13:30:02
3  3 2014-08-01 04:30:02
3  3 2014-08-01 05:30:02
3  3 2014-08-01 06:30:02
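For comparison, here is a minimal sketch of the same idea written with Index.repeat and pd.to_timedelta instead of reindex and the TimedeltaIndex constructor. It is not part of the original answer; it may be useful because, as far as I know, recent pandas releases deprecate the unit argument of the TimedeltaIndex constructor. The names expanded and offsets are illustrative, and the timestamps are the question's data written in ISO form so that parsing does not depend on format inference.

```python
import pandas as pd

df = pd.DataFrame({'N': [1, 1, 2, 3],
                   'start': pd.to_datetime(['2014-08-01 09:30:02',
                                            '2014-08-01 10:30:02',
                                            '2014-08-01 12:30:02',
                                            '2014-08-01 04:30:02'])})

# Repeat each index label N times, then select those labels to duplicate the rows.
expanded = df.loc[df.index.repeat(df['N'])].copy()

# cumcount() numbers the copies of each original row 0, 1, 2, ...
offsets = expanded.groupby(level=0).cumcount()

# Turn the counters into hour offsets. Passing a plain NumPy array avoids
# any index alignment on the duplicated labels.
expanded['start'] = expanded['start'] + pd.to_timedelta(offsets.to_numpy(), unit='h')

print(expanded)
```

The original index labels are kept (with duplicates), as in the output above; calling expanded.reset_index(drop=True) would give a fresh 0..n-1 index if that is preferred.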