I have this data frame:
df1 <- data.frame(a = c("correct", "wrong", "wrong", "correct"),
b = c(1, 2, 3, 4),
c = c("wrong", "wrong", "wrong", "wrong"),
d = c(2, 2, 3, 4))
        a b     c d
1 correct 1 wrong 2
2   wrong 2 wrong 2
3   wrong 3 wrong 3
4 correct 4 wrong 4
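I would like to select only the columns that contain the strings 'correct' or 'wrong' (that is, every column except b and d in df1), so that I get this data frame: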
df2 <- data.frame(a = c("correct", "wrong", "wrong", "correct"),
c = c("wrong", "wrong", "wrong", "wrong"))
a c
1 correct wrong
2 wrong wrong
3 wrong wrong
4 correct wrong
Can I do this with dplyr? If not, which functions could I use instead? The example above is simple enough that I can just do this (dplyr):
select(df1, a, c)
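However, my real data frame has about 700 variables/columns, a few hundred of which contain the strings 'correct' or 'wrong', and I don't know the variable/column names.
Any suggestions on how to do this quickly? Many thanks!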
You can use base R's Filter, which applies a function to each column of df1 and keeps every column for which the logical test is TRUE:
Filter(function(u) any(c('wrong','correct') %in% u), df1)
# a c
#1 correct wrong
#2 wrong wrong
#3 wrong wrong
#4 correct wrong
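The test above keeps a column as soon as it contains at least one of the two strings. If the goal is instead to keep columns made up of nothing but those two values, a stricter test may be closer to what the question describes. The following is only a sketch under that assumption; only_labels is a hypothetical helper name, not something from the original answer.

```r
# Rebuild the example data; stringsAsFactors = FALSE keeps the columns as
# plain character vectors regardless of the R version in use.
df1 <- data.frame(a = c("correct", "wrong", "wrong", "correct"),
                  b = c(1, 2, 3, 4),
                  c = c("wrong", "wrong", "wrong", "wrong"),
                  d = c(2, 2, 3, 4),
                  stringsAsFactors = FALSE)

# Hypothetical helper: TRUE only when every value in the column is
# either "correct" or "wrong".
only_labels <- function(u) all(u %in% c("correct", "wrong"))

# Filter() applies the helper to each column and keeps the ones that pass.
df2 <- Filter(only_labels, df1)
names(df2)
```

With this version a column that mixes the labels with other text, or that contains NA, fails the test and is dropped, which may or may not be what you want for real data.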
You can also use grepl:
Filter(function(u) any(grepl('wrong|correct',u)), df1)
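Since the question asks about dplyr specifically, the same per-column test can probably be expressed with select() and the where() selection helper. This is a sketch that assumes dplyr 1.0.0 or later is installed; it is not part of the original answer.

```r
library(dplyr)

# Same example data as in the question.
df1 <- data.frame(a = c("correct", "wrong", "wrong", "correct"),
                  b = c(1, 2, 3, 4),
                  c = c("wrong", "wrong", "wrong", "wrong"),
                  d = c(2, 2, 3, 4),
                  stringsAsFactors = FALSE)

# where() runs the predicate on every column; select() keeps the columns
# for which it returns TRUE. No column names need to be known in advance.
df2 <- select(df1, where(function(u) any(u %in% c("correct", "wrong"))))
names(df2)
```

The predicate is the same one passed to Filter above, so the two approaches should select the same columns; the dplyr form is mainly convenient when the step sits inside a longer pipeline.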