Syl*_*uez 0 r subset dataframe dplyr
I have the following code and want to select some columns into a new data.frame.
library(dplyr)
df = data.frame(
Manhattan=c(1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0),
Brooklyn=c(0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0),
The_Bronx=c(1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0),
Staten_Island=c(0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0),
"2012"=c("P", "P", "P", "P", "P", "P", "P", "P", "P", "P", "Q", "Q", "Q", "Q", "Q", "Q", "Q", "Q", "Q"),
"2013"=c("P", "P", "P", "P", "P", "P", "P", "P", "Q", "Q", "P", "P", "P", "P", "Q", "Q", "Q", "Q", "Q"),
"2014"=c("P", "P", "P", "Q", "Q", "P", "P", "Q", "Q", "Q", "Q", "Q", "P", "Q", "P", "P", "P", "Q", "Q"),
"2015"=c("P", "P", "P", "P", "P", "Q", "Q", "Q", "P", "Q", "P", "P", "Q", "Q", "Q", "Q", "Q", "Q", "Q"), check.names=FALSE)
df2 <- subset(df, select = c("Manhattan", "Queens", "The_Bronx"))
This raises an error:
Error in `[.data.frame`(x, r, vars, drop = drop) :
undefined columns selected
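because the "Queens" column is missing from df. How can I override that error so that R goes ahead and creates df2 with only the "Manhattan" and "The_Bronx" columns?
Very important: my real data has hundreds of columns, so I cannot manually remove columns such as "Queens" from the command df2 <- subset(df, select = c("Manhattan", "Queens", "The_Bronx")) (unless there is a trick?). Is there a way around this? Thanks.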
In base R, you can use intersect to select only the names that actually exist in the data frame.
cols <- c("Manhattan", "Queens", "The_Bronx")
subset(df, select = intersect(names(df), cols))
# Manhattan The_Bronx
#1 1 1
#2 1 1
#3 0 0
#4 1 0
#5 1 0
#6 1 0
#7 1 0
#8 0 0
#...
#....
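A variation on the base R approach, offered as a minimal sketch rather than the answer's own code: putting cols first in intersect() returns the columns in the order they were requested, and setdiff() shows which names were skipped. The small df and the names present and absent are made up for illustration.

```r
# Small stand-in data frame (hypothetical; the real one has hundreds of columns).
df <- data.frame(
  Manhattan = c(1, 1, 0),
  Brooklyn  = c(0, 0, 0),
  The_Bronx = c(1, 1, 0)
)

# Requested columns; "Queens" is deliberately absent from df.
cols <- c("The_Bronx", "Queens", "Manhattan")

# intersect() keeps the order of its first argument, so cols goes first
# to get the columns back in the requested order.
present <- intersect(cols, names(df))
absent  <- setdiff(cols, names(df))

# Report what was skipped instead of failing.
if (length(absent) > 0) {
  message("Skipping missing columns: ", paste(absent, collapse = ", "))
}

# drop = FALSE keeps a data.frame even if only one column survives.
df2 <- df[, present, drop = FALSE]
names(df2)
```

With intersect(names(df), cols), as in the answer, the result follows the column order of df instead; which one is preferable depends on whether the requested order matters.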
Or use any_of in dplyr:
library(dplyr)
df %>% select(tidyselect::any_of(cols))
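To make the behaviour of any_of concrete, here is a small self-contained sketch that contrasts it with the strict all_of. It assumes dplyr 1.0.0 or later, which re-exports both helpers so the tidyselect:: prefix is optional; the tiny df and the name strict are illustrative only.

```r
library(dplyr)

# Tiny stand-in for the real data (hypothetical values).
df <- data.frame(Manhattan = c(1, 0), Brooklyn = c(0, 1), The_Bronx = c(1, 1))
cols <- c("Manhattan", "Queens", "The_Bronx")

# any_of() silently ignores names that are not present in the data.
df2 <- df %>% select(any_of(cols))
names(df2)

# all_of() is the strict counterpart: it raises an error on a missing name.
# The error is caught here so the script keeps running.
strict <- tryCatch(
  df %>% select(all_of(cols)),
  error = function(e) NULL
)
is.null(strict)
```

any_of suits the case in the question, where some requested columns may legitimately be absent; all_of is the safer choice when a missing column should be treated as a bug.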