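I am trying to use the usual times at which I take my medication (so +4 hours on top of that) to fill in a data frame with a label of 2, 1 or 0: 1 for when I am on the medication, 2 for the hour after the medication, when I have just come off it, and 0 otherwise.
As an example, this is the data frame I am trying to add the column to: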
<bound method NDFrame.to_clipboard of id sentiment magnitude angry disgusted fearful \
created
2020-05-21 12:00:00 23.0 -0.033333 0.5 NaN NaN NaN
2020-05-21 12:15:00 NaN NaN NaN NaN NaN NaN
2020-05-21 12:30:00 NaN NaN NaN NaN NaN NaN
2020-05-21 12:45:00 NaN NaN NaN NaN NaN NaN
2020-05-21 13:00:00 NaN NaN NaN NaN NaN NaN
... ... ... ... ... ... ...
2021-04-20 00:45:00 NaN NaN NaN NaN NaN NaN
2021-04-20 01:00:00 NaN NaN NaN NaN NaN NaN
2021-04-20 01:15:00 NaN NaN NaN NaN NaN NaN
2021-04-20 01:30:00 NaN NaN NaN NaN NaN NaN
2021-04-20 01:45:00 46022.0 -1.000000 1.0 NaN NaN NaN
happy neutral sad surprised
created
2020-05-21 12:00:00 NaN NaN NaN NaN
2020-05-21 12:15:00 NaN NaN NaN NaN
2020-05-21 12:30:00 NaN NaN NaN NaN
2020-05-21 12:45:00 NaN NaN NaN NaN
2020-05-21 13:00:00 NaN NaN NaN NaN
... ... ... ... ...
2021-04-20 00:45:00 NaN NaN NaN NaN
2021-04-20 01:00:00 NaN NaN NaN NaN
2021-04-20 01:15:00 NaN NaN NaN NaN
2021-04-20 01:30:00 NaN NaN NaN NaN
2021-04-20 01:45:00 NaN NaN NaN NaN
[32024 rows x 10 columns]>
and this is the timestamp data for when I usually take my medication:
['09:00 AM', '12:00 PM', '03:00 PM']
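How would I use these timestamps to get this kind of column?
So, trying to base it on this question, how would I make sure the medication label is only added where there is data available, and that the one-hour post-medication period is applied correctly?
Thanks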
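Use np.select() to choose the appropriate label for a given condition.
First, dropna() the rows where all values after created are null (subset=df.columns[1:]). You can change subset depending on your needs (e.g., subset=['id'] if rows should be dropped only for having a null id).
Then generate arrays of times for the active period and the after period, based on the duration of the medication. Check whether the created times match any of the times in active (label 1) or after (label 2); otherwise default to 0.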
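To make the subset choice in the first step concrete, here is a minimal sketch on a made-up three-row frame (the values and the variable names `demo`, `value_cols` and the three result names are invented for illustration). It contrasts dropping only fully empty rows (how="all"), the default how="any", and looking at `id` alone.

```python
import numpy as np
import pandas as pd

# Made-up rows: one fully populated, one with only `id`, one completely empty.
demo = pd.DataFrame({
    "created": pd.to_datetime(
        ["2020-05-21 12:00", "2020-05-21 12:15", "2020-05-21 12:30"]
    ),
    "id": [23.0, 24.0, np.nan],
    "sentiment": [-0.03, np.nan, np.nan],
    "angry": [0.1, np.nan, np.nan],
})

value_cols = demo.columns[1:]  # every column except `created`

# Drop a row only when ALL value columns are null.
all_null_dropped = demo.dropna(subset=value_cols, how="all")

# Default how="any": drop a row as soon as ONE value column is null.
any_null_dropped = demo.dropna(subset=value_cols)

# Only consider `id` when deciding which rows to drop.
id_null_dropped = demo.dropna(subset=["id"])

print(len(all_null_dropped), len(any_null_dropped), len(id_null_dropped))
```

On a frame like the one in the question, where the emotion columns are almost entirely NaN, the default how="any" would likely discard nearly every row, so how="all" or a narrower subset is probably what you want.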
import numpy as np
import pandas as pd

# assumes `created` is a regular column (call df.reset_index() first if it is the index)
# drop rows where every column except column 0 (df.created) is null
df.dropna(subset=df.columns[1:], how='all', inplace=True)

# convert times to datetime
df['created'] = pd.to_datetime(df['created'])
taken = pd.to_datetime(['09:00:00', '12:00:00', '15:00:00'])

# generate time arrays
duration = 2  # hours the medication stays active
active = np.array([(taken + pd.Timedelta(hours=h)).time for h in range(duration)]).ravel()
after = (taken + pd.Timedelta(hours=duration)).time

# define boolean masks by label
hour = df['created'].dt.floor('h').dt.time  # 'h' is spelled 'H' in older pandas
conditions = {
    1: hour.isin(active),
    2: hour.isin(after),
}

# create medication column with np.select()
df['medication'] = np.select(list(conditions.values()), list(conditions.keys()), default=0)
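For trying the approach without the original data, the same idea can be wrapped in a small function and run on a handful of invented timestamps. This is only a sketch: the function name `label_medication`, its parameters, and the sample timestamps are hypothetical, and it parses the 12-hour strings from the question with an explicit format. It floors with a pd.Timedelta rather than a frequency string to sidestep the 'H' versus 'h' alias difference between pandas versions.

```python
import numpy as np
import pandas as pd


def label_medication(created, taken_times, duration=2):
    """Return 1 while medicated, 2 for the hour after, 0 otherwise.

    `created` is any list-like of timestamps; `taken_times` are strings
    such as '09:00 AM'; `duration` is the active period in whole hours.
    """
    taken = pd.to_datetime(taken_times, format="%I:%M %p")

    # Hours of the day covered by the active period of each dose.
    active = [t for h in range(duration)
              for t in (taken + pd.Timedelta(hours=h)).time]
    # The single hour that follows each active period.
    after = list((taken + pd.Timedelta(hours=duration)).time)

    # Floor every timestamp to the hour and keep only the time of day.
    stamps = pd.Series(pd.to_datetime(created))
    hour = stamps.dt.floor(pd.Timedelta(hours=1)).dt.time

    # np.select takes the first matching condition, so label 1 wins
    # if an hour is both "active" for one dose and "after" for another.
    conditions = [hour.isin(active).to_numpy(), hour.isin(after).to_numpy()]
    return np.select(conditions, [1, 2], default=0)


# Invented sample timestamps, for illustration only.
sample = pd.DataFrame({"created": [
    "2020-05-21 12:00", "2020-05-21 12:45", "2020-05-21 13:15",
    "2020-05-21 14:15", "2020-05-21 15:30", "2021-04-20 01:45",
]})
sample["medication"] = label_medication(
    sample["created"], ["09:00 AM", "12:00 PM", "03:00 PM"], duration=2
)
print(sample)
```

Because only the time of day is compared, a dose whose active or after period crosses midnight wraps around to the early hours of the next day, which may or may not be what you want.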
Here is the output with some slightly modified data that better demonstrates the active / after / nan scenarios:
created id sentiment magnitude medication
0 2020-05-21 12:00:00 23.0 -0.033333 0.5 1
3 2020-05-21 12:45:00 39.0 -0.500000 0.5 1
4 2020-05-21 13:00:00 90.0 -0.500000 0.5 1
5 2020-05-21 13:15:00 100.0 -0.033333 0.1 1
9 2020-05-21 14:15:00 1000.0 0.033333 0.5 2
10 2020-05-21 14:30:00 3.0 0.001000 1.0 2
17 2021-04-20 01:00:00 46022.0 -1.000000 1.0 0
20 2021-04-20 01:45:00 46022.0 -1.000000 1.0 0