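Apologies for the poorly worded question, but it is hard to put into a single line.
I have a date-indexed dataframe containing data on the duration of events, like this: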
Date Duration
12-01-2010 5
04-02-2010 1
14-02-2010 241
23-12-2010 6
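I would like to convert this into a dataframe indexed by day that contains binary data showing whether an event was in progress on a given date. For example, for the first event above, which lasts 5 days: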
Date Event
12-01-2010 1
13-01-2010 1
14-01-2010 1
15-01-2010 1
16-01-2010 1
17-01-2010 0
18-01-2010 0
Any ideas?
Thanks
Assuming you are using pandas 0.25 or later, you can use explode:
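# Assumes `import pandas as pd` and that df['Date'] is a datetime column
# (if Date is the index, call df.reset_index() first)
# Generate the list of days that have an event
s = df.apply(lambda row: pd.date_range(row['Date'], periods=row['Duration']), axis=1) \
    .explode() \
    .drop_duplicates()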
# First line: we know those days have at least one event so mark them with 1
# Second line: expand it to cover every day of the year and fill the missing days with 0
result = pd.DataFrame({'Event': 1}, index=s) \
    .reindex(pd.date_range('2010-01-01', '2010-12-31'), fill_value=0)
Result:
Event
2010-01-01 0
2010-01-02 0
2010-01-03 0
2010-01-04 0
2010-01-05 0
2010-01-06 0
2010-01-07 0
2010-01-08 0
2010-01-09 0
2010-01-10 0
2010-01-11 0
2010-01-12 1
2010-01-13 1
2010-01-14 1
2010-01-15 1
2010-01-16 1
2010-01-17 0
2010-01-18 0
2010-01-19 0
2010-01-20 0
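If explode is not available (pandas older than 0.25), the same daily 0/1 series can be built another way. The following is only a sketch, not the method used above: it starts from an all-zero daily series and marks each event's span with label-based slicing. It assumes the sample data from the question, that the dates are day-first (dd-mm-yyyy), that Date is an ordinary column, and that a duration of n days covers the start date plus the following n - 1 days. The names all_days and event are made up for illustration.

```python
import pandas as pd

# Sample data from the question; dates are assumed to be day-first (dd-mm-yyyy).
df = pd.DataFrame({
    "Date": ["12-01-2010", "04-02-2010", "14-02-2010", "23-12-2010"],
    "Duration": [5, 1, 241, 6],
})
df["Date"] = pd.to_datetime(df["Date"], format="%d-%m-%Y")

# Start with a daily series of zeros covering the whole year.
all_days = pd.date_range("2010-01-01", "2010-12-31")
event = pd.Series(0, index=all_days, name="Event")

# Mark every day covered by each event with 1.
# Label slicing on a DatetimeIndex includes both endpoints.
for start, n in zip(df["Date"], df["Duration"]):
    end = start + pd.Timedelta(days=int(n) - 1)
    event.loc[start:end] = 1

print(event.loc["2010-01-10":"2010-01-18"])
```

Overlapping events need no special handling here, because a day that is already 1 is simply set to 1 again.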