Sil*_*iss 5 r data-manipulation mutate
I have a data frame df.
df <- data.frame(
  ID = c(1,1,1,2,2,2,3,3,3,4,4,4,4),
  process = c("inspection", "evaluation", "result",
              "inspection", "result", "evaluation",
              "result", "inspection", "result",
              "evaluation", "result", "result", "evaluation")
)
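I need to add a column true_process that is TRUE when evaluation appears before result within a given ID. If it appears after result, or is missing, the column should be FALSE.
The code I tried: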
library(dplyr)
df %>%
  group_by(ID) %>%
  mutate(true_process = case_when(
    !any(process == "evaluation") ~ "False",
    length(process == "evaluation")[[1]] > length(process == "result")[[1]] ~ "False",
    TRUE ~ "True"
  ))
# A tibble: 13 x 3
# Groups: ID [4]
ID process true_process
<dbl> <fct> <chr>
1 1 inspection True
2 1 evaluation True
3 1 result True
4 2 inspection True
5 2 result True
6 2 evaluation True
7 3 result False
8 3 inspection False
9 3 result False
10 4 evaluation True
11 4 result True
12 4 result True
13 4 evaluation True
The expected output is as follows:
# A tibble: 13 x 3
# Groups: ID [4]
ID process true_process
<dbl> <fct> <lgl>
1 1 inspection TRUE
2 1 evaluation TRUE
3 1 result TRUE
4 2 inspection FALSE
5 2 result FALSE
6 2 evaluation FALSE
7 3 result FALSE
8 3 inspection FALSE
9 3 result FALSE
10 4 evaluation FALSE
11 4 result FALSE
12 4 result FALSE
13 4 evaluation FALSE
Based on the updated data, you can check whether the index of the last instance of evaluation is less than any of the indices of result.
library(dplyr)
df %>%
  group_by(ID) %>%
  mutate(true_process = any(tail(which(process == "evaluation"), 1) < which(process == "result")))
# A tibble: 13 x 3
# Groups: ID [4]
ID process true_process
<dbl> <chr> <lgl>
1 1 inspection TRUE
2 1 evaluation TRUE
3 1 result TRUE
4 2 inspection FALSE
5 2 result FALSE
6 2 evaluation FALSE
7 3 result FALSE
8 3 inspection FALSE
9 3 result FALSE
10 4 evaluation FALSE
11 4 result FALSE
12 4 result FALSE
13 4 evaluation FALSE
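To see why this check also covers an ID with no evaluation at all, here is a minimal base-R sketch of the same logic applied to one group at a time. The helper name eval_before_result is hypothetical and only used for illustration; it is not part of dplyr.

```r
# Hypothetical helper: TRUE if the last "evaluation" precedes at least one "result".
eval_before_result <- function(process) {
  # Position of the last "evaluation"; integer(0) if there is none.
  last_eval <- tail(which(process == "evaluation"), 1)
  # Positions of every "result" in the group.
  results <- which(process == "result")
  # Comparing integer(0) gives logical(0), and any(logical(0)) is FALSE,
  # so a group without an evaluation ends up FALSE.
  any(last_eval < results)
}

df <- data.frame(
  ID = c(1,1,1,2,2,2,3,3,3,4,4,4,4),
  process = c("inspection", "evaluation", "result",
              "inspection", "result", "evaluation",
              "result", "inspection", "result",
              "evaluation", "result", "result", "evaluation")
)

# One logical value per ID: TRUE for ID 1, FALSE for IDs 2, 3 and 4.
per_id <- sapply(split(df$process, df$ID), eval_before_result)
per_id
```

As for the attempt in the question, length(process == "evaluation") is the length of the whole logical vector, which equals the group size, so the two lengths being compared are always equal and that branch never fires.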