Pandas: resampling datetime intervals to seconds

r0f*_*0f1 6 python pandas

Given the following DataFrame:

import pandas as pd

# assign the frame to `df` so the answer below can refer to it
df = pd.DataFrame({"start":  ["2017-01-01 13:09:01", "2017-01-01 13:09:07", "2017-01-01 13:09:12"],
                   "end":    ["2017-01-01 13:09:05", "2017-01-01 13:09:09", "2017-01-01 13:09:14"],
                   "status": ["OK", "ERROR", "OK"]})

Have:

| start               | end                 | status |
|---------------------|---------------------|--------|
| 2017-01-01 13:09:01 | 2017-01-01 13:09:05 | OK     |
| 2017-01-01 13:09:07 | 2017-01-01 13:09:09 | ERROR  | 
| 2017-01-01 13:09:12 | 2017-01-01 13:09:14 | OK     |

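I would like to convert it to another format: "unroll" the intervals into a DatetimeIndex and resample the data to one row per second. Seconds not covered by any interval should be NaN. The result should look like this:

Want: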

|                     | status    |
|---------------------|-----------|
| 2017-01-01 13:09:01 | OK        |
| 2017-01-01 13:09:02 | OK        |
| 2017-01-01 13:09:03 | OK        |
| 2017-01-01 13:09:04 | OK        |
| 2017-01-01 13:09:05 | OK        |
| 2017-01-01 13:09:06 | NAN       |
| 2017-01-01 13:09:07 | ERROR     |
| 2017-01-01 13:09:08 | ERROR     |
| 2017-01-01 13:09:09 | ERROR     |
| 2017-01-01 13:09:10 | NAN       |
| 2017-01-01 13:09:11 | NAN       |
| 2017-01-01 13:09:12 | OK        |
| 2017-01-01 13:09:13 | OK        |
| 2017-01-01 13:09:14 | OK        |

Any help is much appreciated!

roo*_*oot 6

Use an IntervalIndex:

# parse the string columns as datetimes (an IntervalIndex cannot be built from strings)
df['start'] = pd.to_datetime(df['start'])
df['end'] = pd.to_datetime(df['end'])

# create an IntervalIndex from start/end
iv_idx = pd.IntervalIndex.from_arrays(df['start'], df['end'], closed='both')

# generate the desired index of individual times
new_idx = pd.date_range(df['start'].min(), df['end'].max(), freq='s')

# set the index of 'status' as the IntervalIndex, then reindex to the new index
result = df['status'].set_axis(iv_idx).reindex(new_idx)

The resulting output of result:

2017-01-01 13:09:01       OK
2017-01-01 13:09:02       OK
2017-01-01 13:09:03       OK
2017-01-01 13:09:04       OK
2017-01-01 13:09:05       OK
2017-01-01 13:09:06      NaN
2017-01-01 13:09:07    ERROR
2017-01-01 13:09:08    ERROR
2017-01-01 13:09:09    ERROR
2017-01-01 13:09:10      NaN
2017-01-01 13:09:11      NaN
2017-01-01 13:09:12       OK
2017-01-01 13:09:13       OK
2017-01-01 13:09:14       OK
Freq: S, Name: status, dtype: object
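
To see why reindexing against an IntervalIndex fills the gaps with NaN, it can help to look at the lookup directly. The sketch below is only an illustration, not part of the answer above: it rebuilds the intervals from the question's data and asks `IntervalIndex.get_indexer` which interval each second falls into. The names `starts`, `ends`, `intervals`, `seconds`, `positions` and `covered` are made up for this example, and it assumes the intervals do not overlap.

```python
import pandas as pd

# Same start/end values as in the question, parsed as datetimes.
starts = pd.to_datetime(["2017-01-01 13:09:01", "2017-01-01 13:09:07", "2017-01-01 13:09:12"])
ends = pd.to_datetime(["2017-01-01 13:09:05", "2017-01-01 13:09:09", "2017-01-01 13:09:14"])

# closed="both" makes both endpoints part of each interval.
intervals = pd.IntervalIndex.from_arrays(starts, ends, closed="both")

# Every second between the earliest start and the latest end.
seconds = pd.date_range(starts.min(), ends.max(), freq="s")

# Position of the interval containing each second, or -1 if none does.
positions = intervals.get_indexer(seconds)

# Boolean mask of the seconds that are covered by some interval.
covered = positions != -1
```

The seconds where `positions` is -1 are the ones that end up as NaN after `reindex`.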
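
If you would rather avoid IntervalIndex, here is a rough alternative sketch (my own variation, not the approach from the answer above): build one small Series per interval with `pd.date_range`, concatenate them, and reindex onto the full range of seconds. The names `pieces`, `full_idx` and `result` are just illustrative, and this assumes the intervals do not overlap, since `reindex` refuses an index with duplicate timestamps.

```python
import pandas as pd

df = pd.DataFrame({
    "start":  ["2017-01-01 13:09:01", "2017-01-01 13:09:07", "2017-01-01 13:09:12"],
    "end":    ["2017-01-01 13:09:05", "2017-01-01 13:09:09", "2017-01-01 13:09:14"],
    "status": ["OK", "ERROR", "OK"],
})
df["start"] = pd.to_datetime(df["start"])
df["end"] = pd.to_datetime(df["end"])

# One Series per interval: the status repeated for every second from start to end
# (date_range includes both endpoints by default).
pieces = [
    pd.Series(status, index=pd.date_range(start, end, freq="s"), name="status")
    for start, end, status in zip(df["start"], df["end"], df["status"])
]

# Full range of seconds; reindexing leaves NaN where no interval applies.
full_idx = pd.date_range(df["start"].min(), df["end"].max(), freq="s")
result = pd.concat(pieces).reindex(full_idx)
```

This materialises every second of every interval, so for very long intervals the IntervalIndex lookup is likely the lighter option.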