Hor*_*man 3 python label pandas
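I need to solve the following problem. I have timestamps and values. The value can change positively, change negatively, or stay stable. As soon as it changes positively from one row to the next, or stays stable, I want to add a label in a new column. If the value keeps increasing, the same label should be added to that row. As soon as the value changes negatively (decreases), zero should be entered as the label. Can anyone help me?
Input data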
import pandas as pd

df_raw = pd.DataFrame(
    {
        "timestamp": [
            "2017-06-16 05:19:18.993",
            "2017-06-16 05:19:28.993",
            "2017-06-16 05:19:38.993",
            "2017-06-16 05:19:48.993",
            "2017-06-16 05:19:58.993",
            "2017-06-16 05:25:08.993",
            "2017-06-16 05:25:18.993",
            "2017-06-16 07:44:28.993",
            "2017-06-16 07:45:38.993",
        ],
        "signalvalue": [0.0, 12.0, 22.0, 13.0, 0.0, 30.0, 0.0, 3.0, 6.0],
    }
)
timestamp signalvalue
0 2017-06-16 05:19:18.993 0.0
1 2017-06-16 05:19:28.993 12.0
2 2017-06-16 05:19:38.993 22.0
3 2017-06-16 05:19:48.993 13.0
4 2017-06-16 05:19:58.993 0.0
5 2017-06-16 05:25:08.993 30.0
6 2017-06-16 05:25:18.993 0.0
7 2017-06-16 07:44:28.993 3.0
8 2017-06-16 07:45:38.993 6.0
Desired output
timestamp signalvalue label
0 2017-06-16 05:19:18.993 0.0 0
1 2017-06-16 05:19:28.993 12.0 1
2 2017-06-16 05:19:38.993 22.0 1
3 2017-06-16 05:19:48.993 13.0 0
4 2017-06-16 05:19:58.993 0.0 0
5 2017-06-16 05:25:08.993 30.0 2
6 2017-06-16 05:25:18.993 0.0 0
7 2017-06-16 07:44:28.993 3.0 3
8 2017-06-16 07:45:38.993 6.0 3
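You can compute a mask that marks the rows where the diff from the previous row is greater than zero. Then keep only the first row of each stretch of consecutive True values and take a cumsum to number the stretches: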
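# True where the value rose compared with the previous row
m1 = df_raw['signalvalue'].diff().gt(0)
# count stretch starts with cumsum, then zero out the rows that are not rising
df_raw['label'] = (m1 & m1.ne(m1.shift())).cumsum() * m1.astype(int)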
Output:
timestamp signalvalue label
0 2017-06-16 05:19:18.993 0.0 0
1 2017-06-16 05:19:28.993 12.0 1
2 2017-06-16 05:19:38.993 22.0 1
3 2017-06-16 05:19:48.993 13.0 0
4 2017-06-16 05:19:58.993 0.0 0
5 2017-06-16 05:25:08.993 30.0 2
6 2017-06-16 05:25:18.993 0.0 0
7 2017-06-16 07:44:28.993 3.0 3
8 2017-06-16 07:45:38.993 6.0 3
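To make the one-liner easier to follow, here is a minimal sketch that splits it into named steps. The function name label_rising_runs and the allow_equal flag are hypothetical and not part of the answer above. The flag is only an assumption about how the "stays stable" case from the question might be handled (the answer itself uses a strict gt(0), so an unchanged value gets label 0).

```python
import pandas as pd


def label_rising_runs(values: pd.Series, allow_equal: bool = False) -> pd.Series:
    """Number each run of consecutive rising rows 1, 2, 3, ...; all other rows get 0."""
    # Row-to-row change; the first row has no predecessor, so its diff is NaN.
    delta = values.diff()
    # Comparisons against NaN are False, so the first row is never "rising".
    rising = delta.ge(0) if allow_equal else delta.gt(0)
    # A run starts where the current row is rising and the previous row was not.
    run_start = rising & ~rising.shift(fill_value=False)
    # The cumulative count of run starts gives every row the id of the latest run.
    run_id = run_start.cumsum()
    # Keep the id only on rising rows; everything else is labelled 0.
    return run_id.where(rising, 0).astype(int)


signal = pd.Series([0.0, 12.0, 22.0, 13.0, 0.0, 30.0, 0.0, 3.0, 6.0])
labels = label_rising_runs(signal)
print(labels.tolist())  # [0, 1, 1, 0, 0, 2, 0, 3, 3]
```

With the default allow_equal=False this reproduces the label column shown above. Passing allow_equal=True switches the comparison to ge(0), so a row whose value equals the previous one would join (or start) a run instead of resetting to 0.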