Select columns based on the range of column values using dplyr

Tho*_*ley 5 r dplyr

Suppose I have this dataset:

df <- data.frame(a = rep(1:2, 5), 
                 b = c("value", "character", "string", "anotherstring", "character", NA, "code", "variable", NA, "cell"), 
                 c = c(1, 2, 5, 4, 5, 7, 8, 9, 6, 10),
                 d = rep(2:1, 5), 
                 e = rep(1, 10))

df
   a             b  c d e
1  1         value  1 2 1
2  2     character  2 1 1
3  1        string  5 2 1
4  2 anotherstring  4 1 1
5  1     character  5 2 1
6  2          <NA>  7 1 1
7  1          code  8 2 1
8  2      variable  9 1 1
9  1          <NA>  6 2 1
10 2          cell 10 1 1

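I want to select the columns of df whose values are 1 and 2 (so only columns a and d). Assuming I don't know the column names, is there an efficient way to subset the data based on the range of column values in dplyr? I initially tried select_if and select_at without success. Thanks in advance!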

Ron*_*hah 7

You can use select_if with a predicate that checks that a column contains both 1 and 2, and nothing else:

library(dplyr)
df %>%  select_if(~any(. == 1) & any(. == 2) & all(. %in% 1:2))

#   a d
#1  1 2
#2  2 1
#3  1 2
#4  2 1
#5  1 2
#6  2 1
#7  1 2
#8  2 1
#9  1 2
#10 2 1

In newer versions of dplyr, this can be written with where():

df %>%  select(where(~any(. == 1) & any(. == 2) & all(. %in% 1:2)))

The same in base R, using Filter:

Filter(function(x) any(x == 1) & any(x == 2) & all(x %in% 1:2) , df)
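
If the set of allowed values may change, the predicate can be pulled out into a small helper. The following is only a sketch: has_exactly is a hypothetical name, not something dplyr or base R provides, and it assumes that "values are 1 and 2" means the column contains every listed value and nothing else. It relies only on %in%, which never returns NA, so columns with missing values (such as b) are simply rejected.

```r
# Hypothetical helper (not part of dplyr or base R): TRUE when a column
# contains every value in `values` and no other value (NA counts as "other").
has_exactly <- function(x, values) {
  all(values %in% x) && all(x %in% values)
}

# Same data as in the question
df <- data.frame(a = rep(1:2, 5),
                 b = c("value", "character", "string", "anotherstring",
                       "character", NA, "code", "variable", NA, "cell"),
                 c = c(1, 2, 5, 4, 5, 7, 8, 9, 6, 10),
                 d = rep(2:1, 5),
                 e = rep(1, 10))

# One logical flag per column, then subset by that flag
keep <- vapply(df, has_exactly, logical(1), values = 1:2)
result <- df[, keep, drop = FALSE]

names(result)
```

With dplyr loaded, the same helper should also work as a selection predicate, for example df %>% select(where(~ has_exactly(.x, 1:2))).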