How do I split strings into columns of 1/0 flag values?

gh0*_*r18 13 r

I have a character vector like this:

a <- c("a,b,c", "a,b", "a,b,c,d")

What I want to do is create a data frame that looks like this:

   a    b    c    d
1] 1    1    1    0
2] 1    1    0    0
3] 1    1    1    1

I have a feeling that I need some combination of read.table and reshape, but I'm really struggling. Any help is appreciated.

A5C*_*2T1 14

You can try cSplit_e from my "splitstackshape" package:

library(splitstackshape)
a <- c("a,b,c", "a,b", "a,b,c,d")
cSplit_e(as.data.table(a), "a", ",", type = "character", fill = 0)
#          a a_a a_b a_c a_d
# 1:   a,b,c   1   1   1   0
# 2:     a,b   1   1   0   0
# 3: a,b,c,d   1   1   1   1
cSplit_e(as.data.table(a), "a", ",", type = "character", fill = 0, drop = TRUE)
#    a_a a_b a_c a_d
# 1:   1   1   1   0
# 2:   1   1   0   0
# 3:   1   1   1   1

There's also mtabulate from "qdapTools":

library(qdapTools)
mtabulate(strsplit(a, ","))
#   a b c d
# 1 1 1 1 0
# 2 1 1 0 0
# 3 1 1 1 1

A very direct base R approach is to use table along with stack and strsplit:

table(rev(stack(setNames(strsplit(a, ",", TRUE), seq_along(a)))))
#    values
# ind a b c d
#   1 1 1 1 0
#   2 1 1 0 0
#   3 1 1 1 1
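The question asks for a data frame, while table returns a contingency table. A minimal sketch of one way to convert it, assuming base R's as.data.frame.matrix is acceptable here (the names tab and flags are only illustrative):

```r
a <- c("a,b,c", "a,b", "a,b,c,d")

# Same cross-tabulation as above: rows are input positions, columns are tokens
tab <- table(rev(stack(setNames(strsplit(a, ",", fixed = TRUE), seq_along(a)))))

# as.data.frame.matrix keeps the wide layout; plain as.data.frame would
# reshape a table into long format (ind, values, Freq) instead
flags <- as.data.frame.matrix(tab)
flags
```

Note that the cells are counts, so a token repeated within one string would show a value above 1.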

  • @Frank, it does nothing except make it faster ***. (2 upvotes)

Fra*_*ank 8

Another, more convoluted base R solution:

x  <- strsplit(a,",")
xl <- unique(unlist(x))

t(sapply(x,function(z)table(factor(z,levels=xl))))

This gives

     a b c d
[1,] 1 1 1 0
[2,] 1 1 0 0
[3,] 1 1 1 1
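The result above is a matrix of counts. If strict 0/1 flags in a data frame are wanted, a possible variation (not the method used in this answer) is to swap table for %in%. This is a sketch only, and the name flags is illustrative:

```r
a <- c("a,b,c", "a,b", "a,b,c,d")

x  <- strsplit(a, ",", fixed = TRUE)   # list of token vectors, one per string
xl <- unique(unlist(x))                # all distinct tokens, in order of first appearance

# For each string, test every known token for membership; vapply returns a
# tokens-by-strings integer matrix, so transpose to get one row per string
m <- t(vapply(x, function(z) as.integer(xl %in% z), integer(length(xl))))

flags <- as.data.frame(m)
names(flags) <- xl
flags
```

Because %in% only tests membership, a token that appears twice in one string still yields 1 rather than 2.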


Ric*_*ven 5

Another option is tstrsplit() from data.table:

library(data.table)
vapply(tstrsplit(a, ",", fixed = TRUE, fill = 0), ">", integer(length(a)), 0L)
#      [,1] [,2] [,3] [,4]
# [1,]    1    1    1    0
# [2,]    1    1    0    0
# [3,]    1    1    1    1