Count combinations across columns, ignoring order

See*_*ata 4 combinations r

dat <- data.frame(A = c("r","t","y","g","r"),
                  B = c("g","r","r","t","y"),
                  C = c("t","g","t","r","t"))

  A B C
1 r g t
2 t r g
3 y r t
4 g t r
5 r y t

I want to count the sets of characters that appear together across the three columns, ignoring order. For example:

Combinations  Freq
r t g         3
y t r         2

How would I also add frequency counts for a nominal variable, such as gender?

For example:

dat <- data.frame(A = c("r","t","y","g","r"),
                  B = c("g","r","r","t","y"),
                  C = c("t","g","t","r","t"),
             Gender = c("male", "female", "female", "male", "male"))

dat

  A B C Gender
1 r g t   male
2 t r g female
3 y r t female
4 g t r   male
5 r y t   male

to get this:

Combinations  Freq   Male   Female
r t g         3      2       1
y t r         2      1       1

Fra*_*ank 5

You can do...

data.frame(table(combo = sapply(split(as.matrix(dat), row(dat)), 
  function(x) paste(sort(x), collapse=" "))))

  combo Freq
1 g r t    3
2 r t y    2
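
The one-liner above can be easier to follow when unpacked. Here is a minimal base-R sketch of the same idea, assuming the three-column `dat` from the question (the intermediate names `m`, `rows`, and `keys` are only illustrative). Sorting each row before pasting is what makes `r g t` and `t r g` collapse into the same key.

```r
# Example data from the question (three character columns)
dat <- data.frame(A = c("r","t","y","g","r"),
                  B = c("g","r","r","t","y"),
                  C = c("t","g","t","r","t"))

m <- as.matrix(dat)           # character matrix, one row per observation
rows <- split(m, row(m))      # list with one character vector per row

# Sort within each row so the order of values no longer matters,
# then paste the sorted values into a single key
keys <- vapply(rows, function(x) paste(sort(x), collapse = " "), character(1))

# Count how often each key occurs
res <- as.data.frame(table(combo = keys))
print(res)
```

`vapply` is used here instead of `sapply` only to guarantee a character result; either should work for this data.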

For readability, I'd suggest splitting this across multiple lines and/or using magrittr:

d = as.matrix(dat)
library(magrittr)

d %>% split(., row(.)) %>% sapply(
  . %>% sort %>% paste(collapse = " ")
) %>% table(combo = .) %>% data.frame

  combo Freq
1 g r t    3
2 r t y    2

Regarding the edit / new question, I would take a somewhat different approach, perhaps something like...

# new example data
dat <- data.frame(A = c("r","t","y","g","r"),
                  B = c("g","r","r","t","y"),
                  C = c("t","g","t","r","t"),
             Gender = c("male", "female", "female", "male", "male"))

library(data.table)
library(magrittr)  # provides %>%, used in the calls below
setDT(dat)

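# build an order-independent key per row: sort the values in A:C, then paste them
dat[, combo := sapply(transpose(.SD),
  . %>% sort %>% paste(collapse = " ")), .SDcols=A:C]

# per combo: the total count plus one count per gender level (fixed level order)
dat[, c(
  n = .N,
  Gender %>% factor(levels=c("male", "female")) %>% table %>% as.list
), by=combo]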

   combo n male female
1: g r t 3    2      1
2: r t y 2    1      1
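
If data.table is not available, a similar table can probably be built in base R by cross-tabulating the sorted-row key against `Gender`. This is a sketch of an alternative, not the approach above; `key_cols`, `gender`, `tab`, and `res` are illustrative names.

```r
# Example data from the edited question
dat <- data.frame(A = c("r","t","y","g","r"),
                  B = c("g","r","r","t","y"),
                  C = c("t","g","t","r","t"),
             Gender = c("male", "female", "female", "male", "male"))

key_cols <- c("A", "B", "C")

# Order-independent key per row: sort the values, then paste them together
combo <- apply(as.matrix(dat[key_cols]), 1,
               function(x) paste(sort(x), collapse = " "))

# Fix the level order so the columns come out as male, female
gender <- factor(dat$Gender, levels = c("male", "female"))

# Two-way table: rows are combos, columns are gender levels
tab <- table(combo, gender)

# Combine the row totals with the per-gender counts
res <- data.frame(combo = rownames(tab),
                  Freq  = as.vector(rowSums(tab)),
                  as.data.frame.matrix(tab),
                  row.names = NULL)
print(res)
```

Fixing the factor levels up front means a gender that never occurs within a combo still shows up as a zero count rather than being dropped.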