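I have several values in a tibble that I want to convert to missing using the na_if() function from the dplyr package, but I can't seem to get the right result. Any pointers?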
library(dplyr)
set.seed(123)
df <- tibble(
  a1 = c("one", "three", "97", "twenty", "98"),
  a2 = c("R", "Python", "99", "Java", "97"),
  a3 = c("statistics", "Data", "Programming", "99", "Science"),
  a4 = floor(rnorm(5, 80, 2))
)
#--- The long route
df1 <- df %>%
  mutate(across(where(is.character), ~na_if(., "97")),
         across(where(is.character), ~na_if(., "98")),
         across(where(is.character), ~na_if(., "99")))
#---- Trial
df2 <- df %>%
  mutate(across(where(is.character),
                ~na_if(., c("97", "98", "99"))))
You can use:
df %>%
  mutate(
    across(
      where(is.character),
      ~if_else(. %in% c("97", "98", "99"), NA_character_, .)
    )
  )

# A tibble: 5 × 4
  a1     a2     a3             a4
  <chr>  <chr>  <chr>       <dbl>
1 one    R      statistics     80
2 three  Python Data           80
3 NA     NA     Programming    76
4 twenty Java   NA             83
5 NA     NA     Science        78

The reason na_if() doesn't work here is that ~na_if(., c("97", "98", "99")) is basically equivalent to if_else(. == c("97", "98", "99"), NA_character_, .). In other words, it only compares the vectors element by element, one position at a time. You can see why that is a problem: in a1, the "97" in the third position is compared only with the third code, "99", so it is never replaced.
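To make the difference between element-wise comparison and set membership concrete, here is a minimal base R sketch using the a1 values from the question. The variable names (x, codes, pairwise, membership, cleaned) are only illustrative. It shows what == does with a recycled vector; depending on your dplyr version, na_if() itself may instead stop with an error about the size of y rather than recycle it.

```r
# Values taken from column a1 in the question
x <- c("one", "three", "97", "twenty", "98")
codes <- c("97", "98", "99")

# `==` compares position by position, recycling `codes` to
# c("97", "98", "99", "97", "98"). R warns here because 5 is not a
# multiple of 3; the warning is suppressed only to keep the output short.
pairwise <- suppressWarnings(x == codes)

# `%in%` asks, for each element of x, whether it appears anywhere in `codes`
membership <- x %in% codes

print(pairwise)    # only the last position matches ("98" == "98")
print(membership)  # both "97" and "98" are found

# Base R counterpart of the if_else() call: set the matching elements to NA
cleaned <- replace(x, membership, NA_character_)
print(cleaned)
```

With == only the trailing "98" happens to line up with a matching code, whereas %in% flags every coded value regardless of its position, which is why the if_else() version gives the intended result.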