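I have a data frame made up of numeric and non-numeric columns.
I want to extract (subset) only the non-numeric columns, i.e. the character ones. I can subset the numeric columns with sub_num = x[sapply(x, is.numeric)], but I can't get the opposite to work using the is.character form. Can anyone help?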
sbh*_*bha 11
If you are trying to select only the character columns, you can do it with dplyr::select_if() and is.character(). Using the dplyr::starwars sample data as an example:
library(dplyr)
starwars %>%
  select_if(is.character) %>%
  head(2)
# A tibble: 2 x 7
  name           hair_color skin_color eye_color gender homeworld species
  <chr>          <chr>      <chr>      <chr>     <chr>  <chr>     <chr>
1 Luke Skywalker blond      fair       blue      male   Tatooine  Human
2 C-3PO          NA         gold       yellow    NA     Tatooine  Droid
Or, if you are trying to negate a column type, note that the syntax is slightly different:
starwars %>%
  select_if(~!is.numeric(.)) %>%
  head(2)
# A tibble: 2 x 10
  name           hair_color skin_color eye_color gender homeworld species films     vehicles  starships
  <chr>          <chr>      <chr>      <chr>     <chr>  <chr>     <chr>   <list>    <list>    <list>
1 Luke Skywalker blond      fair       blue      male   Tatooine  Human   <chr [5]> <chr [2]> <chr [2]>
2 C-3PO          NA         gold       yellow    NA     Tatooine  Droid   <chr [6]> <chr [0]> <chr [0]>
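For anyone not using dplyr, here is a rough base R sketch of the same two selections, written in the style of the asker's sapply call. The data frame x and its column names are made up for illustration; they are not from the question.

```r
# Hypothetical sample data; the column names are invented for this sketch.
x <- data.frame(
  id    = 1:3,
  score = c(2.5, 3.1, 4.8),
  name  = c("a", "b", "c"),
  city  = c("Oslo", "Lima", "Pune"),
  stringsAsFactors = FALSE
)

# Keep only the character columns (counterpart of select_if(is.character)).
sub_chr <- x[sapply(x, is.character)]

# Keep everything that is not numeric (counterpart of select_if(~!is.numeric(.))).
sub_non_num <- x[!sapply(x, is.numeric)]

names(sub_chr)
names(sub_non_num)
```

Both results are still data frames, because indexing with a single logical vector and no comma selects columns list-style.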
OK, I made a quick attempt at what I had in mind.
I can confirm that the following snippet works:
str(d)
'data.frame': 5 obs. of 3 variables:
$ a: int 1 2 3 4 5
$ b: chr "a" "a" "a" "a" ...
$ c: Factor w/ 1 level "b": 1 1 1 1 1
# Get all character columns
d[, sapply(d, class) == 'character']
# Or, for factors, which might be likely:
d[, sapply(d, class) == 'factor']
# If you want to get both factors and characters use
d[, sapply(d, class) %in% c('character', 'factor')]
With the correct class, your sapply approach should work as well, at least as long as you insert the missing , before the sapply call.
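A small sketch of the two indexing forms, assuming a data frame shaped like the d shown in the str() output (rebuilt here by hand, so treat it as an approximation). As far as I can tell, on a plain data.frame the form without the comma also selects columns; the comma form mainly differs in that it drops to a vector when only one column matches, unless drop = FALSE is given.

```r
# Approximate reconstruction of d from the str() output; an assumption.
d <- data.frame(
  a = 1:5,
  b = rep("a", 5),
  c = factor(rep("b", 5)),
  stringsAsFactors = FALSE
)

keep <- sapply(d, is.character)

# List-style indexing (no comma): the result stays a data frame.
by_list <- d[keep]

# Matrix-style indexing (with comma): a single match is dropped to a vector...
by_comma <- d[, keep]

# ...unless drop = FALSE is passed explicitly.
by_comma_df <- d[, keep, drop = FALSE]

is.data.frame(by_list)
is.data.frame(by_comma)
is.data.frame(by_comma_df)
```

So if the result has to stay a data frame regardless of how many columns match, d[keep] or d[, keep, drop = FALSE] are the safer spellings.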
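The approach using !is.numeric does not scale very well if you have classes that do not belong to the group numeric, factor, character (POSIXct, for example, which I use very often).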
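To illustrate the point about other classes, here is a minimal sketch with a date-time column added. The data frame and the helper name is_text are hypothetical. Negating is.numeric also keeps the POSIXct column, while a positive test that names the wanted types does not.

```r
# Hypothetical data frame with a POSIXct column next to the usual types.
d <- data.frame(
  a = 1:3,
  b = c("x", "y", "z"),
  c = factor(c("u", "v", "u")),
  t = as.POSIXct(c("2020-01-01", "2020-01-02", "2020-01-03"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Negation keeps every non-numeric column, including the date-time one.
non_num <- names(d)[!vapply(d, is.numeric, logical(1))]

# A positive test lists the wanted types explicitly (is_text is a made-up name).
is_text <- vapply(d, function(col) is.character(col) || is.factor(col), logical(1))
d_text <- d[, is_text, drop = FALSE]

non_num
names(d_text)
```

vapply is used instead of sapply here only so that the result is guaranteed to be a logical vector with one entry per column.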