I have a data frame that looks like this:
ID value condition
A 0 0
A 3 0
A 0 1
A 7 1
A 5 0
A 5 0
A 5 0
A 7 0
B 6 0
B 2 1
B 7 0
B 10 1
B 0 0
B 6 0
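I want to change the ID name whenever the condition is met, and also change the ID name of the rows that follow. The condition can be met several times within each ID, so the name has to change each time.
The result can either overwrite the original ID or simply add a new column: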
ID value condition newID
A 0 0 A
A 3 0 A
A 0 1 A1
A 7 1 A1
A 5 0 A2
A 5 0 A2
A 5 0 A2
A 7 0 A2
B 6 0 B
B 2 1 B1
B 7 0 B2
B 10 1 B3
B 0 0 B4
B 6 0 B4
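One option, after grouping by "ID", is to create a run index with rleid (from data.table) and then paste it onto "ID" conditionally with case_when. Subtracting 1 makes the first run in each group 0, so those rows keep the plain ID: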
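library(dplyr)
library(data.table)  # provides rleid()
df1 %>%
  group_by(ID) %>%
  # run index of `condition` within each ID, shifted so the first run is 0
  mutate(newID = rleid(condition) - 1,
         # the first run keeps the bare ID; later runs get the index appended
         newID = case_when(newID == 0 ~ first(ID), TRUE ~ paste0(first(ID), newID)))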
# A tibble: 14 x 4
# Groups: ID [2]
# ID value condition newID
# <chr> <int> <int> <chr>
# 1 A 0 0 A
# 2 A 3 0 A
# 3 A 0 1 A1
# 4 A 7 1 A1
# 5 A 5 0 A2
# 6 A 5 0 A2
# 7 A 5 0 A2
# 8 A 7 0 A2
# 9 B 6 0 B
#10 B 2 1 B1
#11 B 7 0 B2
#12 B 10 1 B3
#13 B 0 0 B4
#14 B 6 0 B4
# Example data (df1) used above
df1 <- structure(list(ID = c("A", "A", "A", "A", "A", "A", "A", "A",
"B", "B", "B", "B", "B", "B"), value = c(0L, 3L, 0L, 7L, 5L,
5L, 5L, 7L, 6L, 2L, 7L, 10L, 0L, 6L), condition = c(0L, 0L, 1L,
1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L)), class = "data.frame",
row.names = c(NA, -14L))
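For readers who would rather not load data.table, the run index that rleid produces can be approximated in base R. The following is only a sketch under a few assumptions: the helper name run_index is made up for illustration, the data frame is rebuilt inline so the block runs on its own, and the rows of each ID are assumed to be already in the desired order. It is an alternative to, not a restatement of, the answer above.

```r
# Same example data as in the post, rebuilt so this block is self-contained.
df1 <- data.frame(
  ID = rep(c("A", "B"), times = c(8, 6)),
  value = c(0L, 3L, 0L, 7L, 5L, 5L, 5L, 7L, 6L, 2L, 7L, 10L, 0L, 6L),
  condition = c(0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: zero-based run index, i.e. a counter that goes up
# by one every time x differs from the previous element.
run_index <- function(x) cumsum(c(0L, diff(x) != 0))

# Compute the run index separately within each ID.
idx <- ave(df1$condition, df1$ID, FUN = run_index)

# Rows in the first run keep the bare ID; later runs get the index appended.
df1$newID <- ifelse(idx == 0, df1$ID, paste0(df1$ID, idx))

print(df1)
```

As with the rleid version, the first run of each ID keeps the bare name even if that run happens to have condition equal to 1; whether that is wanted depends on the data.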