I want to replace repeated elements within a group.
df <- data.frame(A=c("a", "a", "a", "b", "b", "c"), group = c(1, 1, 2, 2, 2, 3))
I want to keep the first element of each group and replace all the others with NA. Something like:
library(dplyr)
df <- df %>%
  group_by(group) %>%
  mutate(B = first(A))
This does not produce what I want, because first(A) is recycled to every row of the group. What I want is B <- c("a", NA, "a", NA, NA, "c").
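Use replace with duplicated. Note that this only sets values that repeat within a group to NA, so the b in group 2 is kept: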
df %>%
  group_by(group) %>%
  mutate(B = replace(A, duplicated(A), NA))
# A tibble: 6 x 3
# Groups: group [3]
#   A      group B
#   <fctr> <dbl> <fctr>
# 1 a          1 a
# 2 a          1 NA
# 3 a          2 a
# 4 b          2 b
# 5 b          2 NA
# 6 c          3 c
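To see why the duplicated approach keeps the b in group 2, it may help to look at what duplicated() returns for that group's values on their own. This is a minimal base R sketch; the vector x is just the group 2 slice of A copied out by hand for illustration.

```r
# Values of A within group 2 of the example data
x <- c("a", "b", "b")

# duplicated() flags only elements that already appeared earlier in the vector
dup <- duplicated(x)
print(dup)  # FALSE FALSE TRUE

# replace() sets the flagged positions to NA and leaves the rest untouched
res <- replace(x, dup, NA)
print(res)  # "a" "b" NA
```

The first b is not a duplicate of anything before it, so only the second b becomes NA.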
Or, if only the first element of each group should be kept:
df %>%
  group_by(group) %>%
  mutate(B = ifelse(row_number() == 1, as.character(A), NA))
# A tibble: 6 x 3
# Groups: group [3]
#   A      group B
#   <fctr> <dbl> <chr>
# 1 a          1 a
# 2 a          1 <NA>
# 3 a          2 a
# 4 b          2 <NA>
# 5 b          2 <NA>
# 6 c          3 c
Or use replace:
df %>%
  group_by(group) %>%
  mutate(B = replace(A, row_number() > 1, NA))
# A tibble: 6 x 3
# Groups: group [3]
#   A      group B
#   <fctr> <dbl> <fctr>
# 1 a          1 a
# 2 a          1 NA
# 3 a          2 a
# 4 b          2 NA
# 5 b          2 NA
# 6 c          3 c
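For a side-by-side check, the two replace variants can be run in one pipeline. This is only a sketch: the column names B_dup and B_first are made up for illustration, and stringsAsFactors = FALSE is an added assumption so that A is a character column regardless of R version.

```r
library(dplyr)

df <- data.frame(
  A = c("a", "a", "a", "b", "b", "c"),
  group = c(1, 1, 2, 2, 2, 3),
  stringsAsFactors = FALSE  # assumption: keep A as character on any R version
)

out <- df %>%
  group_by(group) %>%
  mutate(
    B_dup   = replace(A, duplicated(A), NA),     # NA only for values repeated within the group
    B_first = replace(A, row_number() > 1, NA)   # NA for every row after the first in the group
  ) %>%
  ungroup()

print(out)
```

Only B_first matches the vector asked for in the question; B_dup differs in row 4, where the first b of group 2 is kept.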