How do I select specific columns that contain certain strings/characters?

hsl*_*hsl 6 r dataframe dplyr

I have this data frame:

df1 <- data.frame(a = c("correct", "wrong", "wrong", "correct"),
  b = c(1, 2, 3, 4),
  c = c("wrong", "wrong", "wrong", "wrong"),
  d = c(2, 2, 3, 4))

a       b c     d
correct 1 wrong 2
wrong   2 wrong 2
wrong   3 wrong 3
correct 4 wrong 4

and I only want to select the columns that contain the string 'correct' or 'wrong' (i.e., columns a and c in df1), so that I get this data frame:

df2 <- data.frame(a = c("correct", "wrong", "wrong", "correct"),
        c = c("wrong", "wrong", "wrong", "wrong"))

        a     c
1 correct wrong
2   wrong wrong
3   wrong wrong
4 correct wrong

Can I do this with dplyr? If not, which functions could I use? The example I gave is simple, because I could just do this (dplyr):

select(df1, a, c)

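However, my real data frame has about 700 variables/columns, a few hundred of which contain the string 'correct' or 'wrong', and I don't know the variable/column names.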

Any suggestions on how to do this quickly? Many thanks!

Col*_*vel 9

You can use base R's Filter, which applies a function to each column of df1 and keeps every column that satisfies the logical test:

Filter(function(u) any(c('wrong','correct') %in% u), df1)
#        a     c
#1 correct wrong
#2   wrong wrong
#3   wrong wrong
#4 correct wrong
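For reference, the same selection can be written as an explicit logical index, which may make it clearer what Filter is doing on each column. This is only a minimal sketch; the names `targets` and `keep` are illustrative and not part of the original answer.

```r
df1 <- data.frame(a = c("correct", "wrong", "wrong", "correct"),
                  b = c(1, 2, 3, 4),
                  c = c("wrong", "wrong", "wrong", "wrong"),
                  d = c(2, 2, 3, 4))

# Values that mark a column as one to keep (illustrative name).
targets <- c("correct", "wrong")

# One TRUE/FALSE per column: does the column contain any target value?
keep <- vapply(df1, function(u) any(u %in% targets), logical(1))

# Keep only the flagged columns; drop = FALSE preserves the data frame
# even when a single column is selected.
df2 <- df1[, keep, drop = FALSE]
names(df2)
```

Storing `keep` separately can be handy with many columns, since `names(df1)[keep]` then lists which of them matched.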

You can also use grepl:

Filter(function(u) any(grepl('wrong|correct',u)), df1)
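Note that the two tests are not quite equivalent: %in% compares whole values, while grepl matches the pattern anywhere inside a value. The sketch below uses a made-up data frame `x` (not from the question) to show the difference, along with an anchored pattern as one possible way to make grepl match whole values only.

```r
# Hypothetical data: "incorrect" contains the substring "correct".
x <- data.frame(status = c("incorrect", "n/a"),
                score  = c(1, 2))

# Exact matching: a value must equal "wrong" or "correct".
exact <- vapply(x, function(u) any(u %in% c("wrong", "correct")), logical(1))

# Pattern matching: "wrong" or "correct" may appear anywhere in a value.
pattern <- vapply(x, function(u) any(grepl("wrong|correct", u)), logical(1))

# Anchored pattern: the whole value must be "wrong" or "correct".
anchored <- vapply(x, function(u) any(grepl("^(wrong|correct)$", u)), logical(1))

exact[["status"]]     # FALSE
pattern[["status"]]   # TRUE
anchored[["status"]]  # FALSE
```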
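Since the question asks about dplyr specifically, the same test can probably be expressed with select() and the where() selection helper. This is a sketch assuming dplyr 1.0.0 or later is installed; it is not part of the original answer.

```r
library(dplyr)

df1 <- data.frame(a = c("correct", "wrong", "wrong", "correct"),
                  b = c(1, 2, 3, 4),
                  c = c("wrong", "wrong", "wrong", "wrong"),
                  d = c(2, 2, 3, 4))

# where() applies the predicate to each column and keeps those returning TRUE.
df2 <- df1 %>%
  select(where(function(col) any(col %in% c("correct", "wrong"))))

names(df2)
```

On older dplyr versions, `select_if(df1, function(col) any(col %in% c("correct", "wrong")))` should give the same columns.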