Assign identifiers to a series of events while ignoring erroneous observations

GrB*_*rBa 5 r dplyr

Sample data I received from a device's operation logger:

df1 <- read.table(text = "temp.1
heating
heating
heating
heating
heating
heating
heating
heating
cooling
heating
heating
heating
heating
heating
heating
cooling
cooling
cooling
cooling
cooling
cooling
cooling
heating
heating
heating
cooling
cooling
heating
heating
heating
cooling
heating
heating
heating
heating
cooling
cooling
cooling
cooling
heating
heating
heating
cooling
heating
cooling
heating
cooling
heating
heating
heating
heating", header = TRUE)

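Sometimes one (at most two) "cooling" observations appear during a "heating" period. These are errors and I want to ignore them. I would like to label the corrected duty cycles. The label should also contain a sequence number, because I need to know how many heating and cooling cycles occurred on a given day. Expected result: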

> df1
    temp.1 level
1  heating   H.1
2  heating   H.1
3  heating   H.1
4  heating   H.1
5  heating   H.1
6  heating   H.1
7  heating   H.1
8  heating   H.1
9  cooling   H.1
10 heating   H.1
11 heating   H.1
12 heating   H.1
13 heating   H.1
14 heating   H.1
15 heating   H.1
16 cooling   C.1
17 cooling   C.1
18 cooling   C.1
19 cooling   C.1
20 cooling   C.1
21 cooling   C.1
22 cooling   C.1
23 heating   H.2
24 heating   H.2
25 heating   H.2
26 cooling   H.2
27 cooling   H.2
28 heating   H.2
29 heating   H.2
30 heating   H.2
31 cooling   H.2
32 heating   H.2
33 heating   H.2
34 heating   H.2
35 heating   H.2
36 cooling   C.2
37 cooling   C.2
38 cooling   C.2
39 cooling   C.2
40 heating   H.3
41 heating   H.3
42 heating   H.3
43 cooling   H.3
44 heating   H.3
45 cooling   H.3
46 heating   H.3
47 cooling   H.3
48 heating   H.3
49 heating   H.3
50 heating   H.3
51 heating   H.3

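EDIT2: There is one more case that I did not anticipate, and my question was imprecise. Please see rows 51-53. When a "cooling" series is interrupted by a single "heating", that observation should be ignored as well. I tried to modify your solution, but without success.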

df1
     temp.1 level
 1: heating   H.1
 2: heating   H.1
 3: heating   H.1
 4: heating   H.1
 5: heating   H.1
 6: heating   H.1
 7: heating   H.1
 8: heating   H.1
 9: cooling   H.1
10: heating   H.1
11: heating   H.1
12: heating   H.1
13: heating   H.1
14: heating   H.1
15: heating   H.1
16: cooling   C.1
17: cooling   C.1
18: cooling   C.1
19: cooling   C.1
20: cooling   C.1
21: cooling   C.1
22: cooling   C.1
23: heating   H.2
24: heating   H.2
25: heating   H.2
26: cooling   H.2
27: cooling   H.2
28: heating   H.2
29: heating   H.2
30: heating   H.2
31: cooling   H.2
32: heating   H.2
33: heating   H.2
34: heating   H.2
35: heating   H.2
36: cooling   C.2
37: cooling   C.2
38: cooling   C.2
39: cooling   C.2
40: heating   H.3
41: heating   H.3
42: heating   H.3
43: cooling   H.3
44: heating   H.3
45: cooling   H.3
46: heating   H.3
47: cooling   C.3
48: cooling   C.3
49: cooling   C.3
50: cooling   C.3
51: cooling   C.3
52: heating   C.3
53: cooling   C.3
54: cooling   C.3
55: cooling   C.3
56: heating   H.4
57: heating   H.4
58: heating   H.4

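Three "heating" readings in a row after "cooling", or three "cooling" readings in a row after "heating", change the category in "level". Therefore rows 26-27 are treated as errors, whereas rows 23-25 do change the "level".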
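To make the rule concrete, here is a minimal base R sketch on a toy vector (hypothetical data, not the logger output). It assumes the first run is at least three readings long, and it only illustrates the rule; it is not one of the posted solutions.

```r
# Toy vector (hypothetical): a heating phase with a 2-reading cooling
# glitch, followed by a real cooling phase.
x <- c("heating", "heating", "heating", "cooling", "cooling",
       "heating", "heating", "heating", "cooling", "cooling", "cooling")

runs <- rle(x)               # lengths and values of consecutive runs
keep <- runs$lengths >= 3    # only runs of 3+ readings define a phase

# Each short run inherits the value of the most recent long run.
# Assumes the first run is long; otherwise the index would start at 0.
idx <- cumsum(keep)
runs$values <- runs$values[keep][idx]
corrected <- inverse.rle(runs)

# Number the phases separately for each type and expand back to rows.
phase <- rle(corrected)
id <- ave(seq_along(phase$values), phase$values, FUN = seq_along)
labels <- rep(paste(toupper(substr(phase$values, 1, 1)), id, sep = "."),
              phase$lengths)
```

With this input, the two "cooling" readings in the middle are absorbed into the first heating phase, so `labels` holds one heating label followed by one cooling label.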

Wim*_*pel 6

A data.table approach

library(data.table)
# set to data.table format
setDT(df1)
# initialise heating or cooling level
df1[, level := toupper(substr(temp.1,1,1))]
# override the level of runs of length 2 or less with "H"
df1[, level := if (.N <= 2) "H", by = .(rleid(temp.1))]
# temporary value for indexing; can be dropped at the end
df1[, temp := rleid(level)]
# create the correct level id, and afterwards drop the temp column
df1[, level := paste(level, as.integer(factor(temp)), sep = "."), by = .(level)][, temp := NULL][]
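As a small illustration of the run detection this relies on, the sketch below shows what `rleid()` and `.N` return on a toy table (hypothetical data; the column names `run` and `run_len` are made up for the example).

```r
library(data.table)

# Toy data (hypothetical): one single-reading cooling glitch inside heating.
dt <- data.table(temp.1 = c("heating", "heating", "heating", "cooling",
                            "heating", "heating", "cooling", "cooling", "cooling"))

# rleid() gives each run of identical consecutive values its own id;
# .N is the number of rows in the current by-group, i.e. the run length.
dt[, run := rleid(temp.1)]
dt[, run_len := .N, by = run]
```

Note that the code above always overwrites short runs with "H". That is correct for a cooling glitch inside heating, but a short heating run inside a cooling phase would also become "H", which is the case raised in EDIT2.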

Update for the revised sample data / desired output

library(data.table)
setDT(df1)
# determine groups of 3 (or more) consecutive temp.1
df1[, group := if (.N >= 3) .GRP, by = .(rleid(temp.1))]
# fill down missing groupnumbers
setnafill(df1, type = "locf", cols = "group")
# set level letter (from initial answer)
df1[, level := toupper(substr(temp.1[1],1,1)), by = .(group)]
df1[, temp := rleid(level)]
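# build the final label, then drop the helper columns so only temp.1 and level remain
df1[, level := paste(level, as.integer(factor(temp)), sep = "."), by = .(level)][, c("group", "temp") := NULL][]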
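Since the question asks for the number of heating and cooling cycles, one possible follow-up is to count the distinct labels per type. This is a sketch on a toy table shaped like the labelled output (hypothetical data; `res`, `cycles`, `type` and `n_cycles` are names chosen for the example).

```r
library(data.table)

# Toy result (hypothetical), shaped like the labelled output above.
res <- data.table(level = c("H.1", "H.1", "H.1", "C.1", "C.1", "C.1",
                            "H.2", "H.2", "H.2", "C.2", "C.2", "C.2",
                            "H.3", "H.3", "H.3"))

# The letter before the dot is the phase type; the number of distinct
# labels per type is the number of cycles of that type.
cycles <- res[, .(n_cycles = uniqueN(level)),
              by = .(type = substr(level, 1, 1))]
```

For data covering several days, a date column could be added to `by` to get the counts per day, assuming the logger provides a timestamp.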

  • Regarding the edit: how do you determine whether the error is in row 52 or in row 53? (2 upvotes)