Sample data I received from the device's operation logger:
df1 <- read.table(text = "temp.1
heating
heating
heating
heating
heating
heating
heating
heating
cooling
heating
heating
heating
heating
heating
heating
cooling
cooling
cooling
cooling
cooling
cooling
cooling
heating
heating
heating
cooling
cooling
heating
heating
heating
cooling
heating
heating
heating
heating
cooling
cooling
cooling
cooling
heating
heating
heating
cooling
heating
cooling
heating
cooling
heating
heating
heating
heating", header = TRUE)
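Sometimes a single "cooling" observation (two at most) shows up in the middle of a "heating" period. This is a logging error, and I want to ignore these values. I want to label the corrected duty cycles. The label should also carry a sequence number, because I need to know how many heating and cooling cycles occurred on a given day. Expected result: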
> df1
temp.1 level
1 heating H.1
2 heating H.1
3 heating H.1
4 heating H.1
5 heating H.1
6 heating H.1
7 heating H.1
8 heating H.1
9 cooling H.1
10 heating H.1
11 heating H.1
12 heating H.1
13 heating H.1
14 heating H.1
15 heating H.1
16 cooling C.1
17 cooling C.1
18 cooling C.1
19 cooling C.1
20 cooling C.1
21 cooling C.1
22 cooling C.1
23 heating H.2
24 heating H.2
25 heating H.2
26 cooling H.2
27 cooling H.2
28 heating H.2
29 heating H.2
30 heating H.2
31 cooling H.2
32 heating H.2
33 heating H.2
34 heating H.2
35 heating H.2
36 cooling C.2
37 cooling C.2
38 cooling C.2
39 cooling C.2
40 heating H.3
41 heating H.3
42 heating H.3
43 cooling H.3
44 heating H.3
45 cooling H.3
46 heating H.3
47 cooling H.3
48 heating H.3
49 heating H.3
50 heating H.3
51 heating H.3
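EDIT2: There is one more case I did not anticipate, and my question was imprecise. See rows 51-53. When a "cooling" series is interrupted by a single "heating", that value should be ignored as well. I tried to modify your solution, but without success.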
df1
temp.1 level
1: heating H.1
2: heating H.1
3: heating H.1
4: heating H.1
5: heating H.1
6: heating H.1
7: heating H.1
8: heating H.1
9: cooling H.1
10: heating H.1
11: heating H.1
12: heating H.1
13: heating H.1
14: heating H.1
15: heating H.1
16: cooling C.1
17: cooling C.1
18: cooling C.1
19: cooling C.1
20: cooling C.1
21: cooling C.1
22: cooling C.1
23: heating H.2
24: heating H.2
25: heating H.2
26: cooling H.2
27: cooling H.2
28: heating H.2
29: heating H.2
30: heating H.2
31: cooling H.2
32: heating H.2
33: heating H.2
34: heating H.2
35: heating H.2
36: cooling C.2
37: cooling C.2
38: cooling C.2
39: cooling C.2
40: heating H.3
41: heating H.3
42: heating H.3
43: cooling H.3
44: heating H.3
45: cooling H.3
46: heating H.3
47: cooling C.3
48: cooling C.3
49: cooling C.3
50: cooling C.3
51: cooling C.3
52: heating C.3
53: cooling C.3
54: cooling C.3
55: cooling C.3
56: heating H.4
57: heating H.4
58: heating H.4
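Three consecutive "heating" values after "cooling", or three consecutive "cooling" values after "heating", change the category in `level`. So rows 26-27 are treated as an error, while rows 23-25 do change the `level`.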
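To make the rule concrete, the run lengths can be inspected with base R's `rle()`. This is only an illustrative sketch on a short made-up vector `x` (not the logger data); the column name `long_enough` is likewise just a name chosen for the example.

```r
# hypothetical short example vector, not the logger data
x <- c("heating", "heating", "heating", "cooling", "cooling",
       "heating", "heating", "heating", "cooling", "cooling", "cooling")

# run-length encoding: one entry per block of identical consecutive values
runs <- rle(x)

# one row per run: its value, its length, and whether it reaches the
# threshold of 3 that lets it count as a real heating/cooling phase
run_info <- data.frame(value = runs$values,
                       length = runs$lengths,
                       long_enough = runs$lengths >= 3)
print(run_info)
```

Here the two-row "cooling" run is too short and would be absorbed into the surrounding heating phase, while the final three-row "cooling" run would start a cooling phase.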
A data.table approach
library(data.table)
# set to data.table format
setDT(df1)
# initialise heating or cooling level
df1[, level := toupper(substr(temp.1,1,1))]
# runs of 2 or fewer identical values are treated as noise: override their level with "H"
df1[, level := if (.N <= 2) "H", by = .(rleid(temp.1))]
# temporary value for indexing, can be dropped at the end
df1[, temp := rleid(level)]
# create the correct level id, and afterwards drop the temp column
df1[, level := paste(level, as.integer(factor(temp)), sep = "."), by = .(level)][, temp := NULL][]
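The question also asks how many heating and cooling cycles occurred. Once a `level` column exists, one way to get that is to count the distinct labels per letter. The sketch below uses base R and, so that it runs on its own, rebuilds the `level` values expected for the sample data with `rep()`; with the real table you would use `df1$level` instead.

```r
# expected labels for the 51-row sample, rebuilt from their run lengths
# (stand-in for df1$level so the snippet is self-contained)
level <- rep(c("H.1", "C.1", "H.2", "C.2", "H.3"),
             times = c(15, 7, 13, 4, 12))

# each distinct label is one cycle; the first letter tells its type
cycles <- unique(level)
counts <- table(substr(cycles, 1, 1))
print(counts)
```

For the sample data this gives three heating cycles and two cooling cycles.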
Update for the updated sample data and desired output
library(data.table)
setDT(df1)
# determine groups of 3 (or more) consecutive temp.1
df1[, group := if (.N >= 3) .GRP, by = .(rleid(temp.1))]
# fill down missing groupnumbers
setnafill(df1, type = "locf", cols = "group")
# set level letter (from initial answer)
df1[, level := toupper(substr(temp.1[1],1,1)), by = .(group)]
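# helper run id over the merged levels, used only for numbering
df1[, temp := rleid(level)]
# number each phase within its letter, then drop the helper columns
df1[, level := paste(level, as.integer(factor(temp)), sep = "."), by = .(level)][, c("temp", "group") := NULL][]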
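For readers without data.table, the same idea (keep only runs of 3 or more, carry the last valid state forward over shorter runs, then number the phases) can be sketched in base R with `rle()`. The function name `label_cycles` and its `min_run` argument are hypothetical names for this example, and the handling of a short run at the very start is an assumption, since the sample data never begins with one.

```r
# Base R sketch; label_cycles and min_run are hypothetical names.
label_cycles <- function(x, min_run = 3) {
  runs <- rle(as.character(x))
  # a run keeps its own value only if it is long enough, otherwise NA
  state <- ifelse(runs$lengths >= min_run, runs$values, NA)
  # assumption: a too-short run at the very start keeps its own value
  if (is.na(state[1])) state[1] <- runs$values[1]
  # carry the last valid state forward over short runs (LOCF)
  for (i in seq_along(state)[-1]) {
    if (is.na(state[i])) state[i] <- state[i - 1]
  }
  # merge neighbouring runs that now share the same state
  merged <- rle(rep(state, runs$lengths))
  letter <- toupper(substr(merged$values, 1, 1))
  # number the phases separately per letter: H.1, C.1, H.2, ...
  id <- ave(seq_along(letter), letter, FUN = seq_along)
  rep(paste(letter, id, sep = "."), merged$lengths)
}

# the 51-row sample from the question, rebuilt from its run lengths
x <- rep(rep_len(c("heating", "cooling"), 17),
         times = c(8, 1, 6, 7, 3, 2, 3, 1, 4, 4, 3, 1, 1, 1, 1, 1, 4))
level <- label_cycles(x)
print(rle(level))
```

With a data frame you would call it as `df1$level <- label_cycles(df1$temp.1)`. The explicit loop is kept for readability; for a day's worth of logger rows the number of runs is small, so speed should not be a concern.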