Raf*_*ael 16 python time timedelta pandas
我有以下数据帧:
Time Work
2018-12-01 10:00:00 Off
2018-12-01 10:00:02 On
2018-12-01 10:00:05 On
2018-12-01 10:00:06 On
2018-12-01 10:00:07 On
2018-12-01 10:00:09 Off
2018-12-01 10:00:11 Off
2018-12-01 10:00:14 On
2018-12-01 10:00:16 On
2018-12-01 10:00:18 On
2018-12-01 10:00:20 Off
Run Code Online (Sandbox Code Playgroud)
我想创建一个新列,其中包含自设备开始工作以来经过的时间.
Time Work Elapsed Time
2018-12-01 10:00:00 Off 0
2018-12-01 10:00:02 On 2
2018-12-01 10:00:05 On 5
2018-12-01 10:00:06 On 6
2018-12-01 10:00:07 On 7
2018-12-01 10:00:09 Off 0
2018-12-01 10:00:11 Off 0
2018-12-01 10:00:14 On 3
2018-12-01 10:00:16 On 5
2018-12-01 10:00:18 On 7
2018-12-01 10:00:20 Off 0
Run Code Online (Sandbox Code Playgroud)
我该怎么做?
cs9*_*s95 14
你可以使用groupby:
# df['Time'] = pd.to_datetime(df['Time'], errors='coerce') # Uncomment if needed.
sec = df['Time'].dt.second
df['Elapsed Time'] = (
sec - sec.groupby(df.Work.eq('Off').cumsum()).transform('first'))
df
Time Work Elapsed Time
0 2018-12-01 10:00:00 Off 0
1 2018-12-01 10:00:02 On 2
2 2018-12-01 10:00:05 On 5
3 2018-12-01 10:00:06 On 6
4 2018-12-01 10:00:07 On 7
5 2018-12-01 10:00:09 Off 0
6 2018-12-01 10:00:11 Off 0
7 2018-12-01 10:00:14 On 3
8 2018-12-01 10:00:16 On 5
9 2018-12-01 10:00:18 On 7
10 2018-12-01 10:00:20 Off 0
Run Code Online (Sandbox Code Playgroud)
想法是提取秒部分并从状态从"关"变为"开"的第一时刻减去经过的时间.这是使用transform和完成的first.
cumsum 用于识别组:
df.Work.eq('Off').cumsum()
0 1
1 1
2 1
3 1
4 1
5 2
6 3
7 3
8 3
9 3
10 4
Name: Work, dtype: int64
Run Code Online (Sandbox Code Playgroud)
如果您的设备有可能在"开启"时跨越多分钟,则初始化sec为:
sec = df['Time'].values.astype(np.int64) // 10e8
df['Elapsed Time'] = (
sec - sec.groupby(df.Work.eq('Off').cumsum()).transform('first'))
df
Time Work Elapsed Time
0 2018-12-01 10:00:00 Off 0.0
1 2018-12-01 10:00:02 On 2.0
2 2018-12-01 10:00:05 On 5.0
3 2018-12-01 10:00:06 On 6.0
4 2018-12-01 10:00:07 On 7.0
5 2018-12-01 10:00:09 Off 0.0
6 2018-12-01 10:00:11 Off 0.0
7 2018-12-01 10:00:14 On 3.0
8 2018-12-01 10:00:16 On 5.0
9 2018-12-01 10:00:18 On 7.0
10 2018-12-01 10:00:20 Off 0.0
Run Code Online (Sandbox Code Playgroud)
IIUC first与transform
(df.Time-df.Time.groupby(df.Work.eq('Off').cumsum()).transform('first')).dt.seconds
Out[1090]:
0 0
1 2
2 5
3 6
4 7
5 0
6 0
7 3
8 5
9 7
10 0
Name: Time, dtype: int64
Run Code Online (Sandbox Code Playgroud)
你可以用两个groupbys.第一个计算每个组内的时差.然后第二个对每个组内的那些进行求和.
s = (df.Work=='Off').cumsum()
df['Elapsed Time'] = df.groupby(s).Time.diff().dt.total_seconds().fillna(0).groupby(s).cumsum()
Run Code Online (Sandbox Code Playgroud)
Time Work Elapsed Time
0 2018-12-01 10:00:00 Off 0.0
1 2018-12-01 10:00:02 On 2.0
2 2018-12-01 10:00:05 On 5.0
3 2018-12-01 10:00:06 On 6.0
4 2018-12-01 10:00:07 On 7.0
5 2018-12-01 10:00:09 Off 0.0
6 2018-12-01 10:00:11 Off 0.0
7 2018-12-01 10:00:14 On 3.0
8 2018-12-01 10:00:16 On 5.0
9 2018-12-01 10:00:18 On 7.0
10 2018-12-01 10:00:20 Off 0.0
Run Code Online (Sandbox Code Playgroud)
| 归档时间: |
|
| 查看次数: |
639 次 |
| 最近记录: |