I have a question about combinations.
My mini sample looks like this:
sample <- data.frame(
group=c("a","a","a","a","b","b","b"),
number=c(1,2,3,2,4,5,3)
)
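If I apply the combn function to the data frame, it gives the following result: every combination of the values in the "number" column, regardless of which group each value belongs to: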
[,1] [,2]
[1,] 1 2
[2,] 1 3
[3,] 1 2
[4,] 1 4
[5,] 1 5
[6,] 1 3
[7,] 2 3
[8,] 2 2
[9,] 2 4
[10,] 2 5
[11,] 2 3
[12,] 3 2
[13,] 3 4
[14,] 3 5
[15,] 3 3
[16,] 2 4
[17,] 2 5
[18,] 2 3
[19,] 4 5
[20,] 4 3
[21,] 5 3
The code I used to get the result above is:
t(combn((sample$number), 2))
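However, I want the combinations to be formed within each group (i.e. within "a" and within "b"). The result I am after should look like this: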
[,1] [,2] [,3]
[1,] a 1 2
[2,] a 1 3
[3,] a 1 2
[4,] a 2 3
[5,] a 2 2
[6,] a 3 2
[7,] b 4 5
[8,] b 4 3
[9,] b 5 3
In addition to the combinations, I would like a column indicating the group.
We can do the computation by group with data.table:
library(data.table)
setDT(sample)[, {i1 <- combn(number, 2)
list(i1[1,], i1[2,]) }, by = group]
# group V1 V2
#1: a 1 2
#2: a 1 3
#3: a 1 2
#4: a 2 3
#5: a 2 2
#6: a 3 2
#7: b 4 5
#8: b 4 3
#9: b 5 3
Or a more compact option is
setDT(sample)[, transpose(combn(number, 2, FUN = list)), by = group]
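To see why the compact form works, it may help to run the two steps on a single group outside of data.table's `by`. The sketch below uses the values of group "b" from the question; the variable names `pairs` and `cols` are only illustrative.

```r
library(data.table)

number <- c(4, 5, 3)                    # the values of group "b"

# With FUN = list, combn returns one length-2 vector per combination
pairs <- combn(number, 2, FUN = list)   # (4, 5), (4, 3), (5, 3)

# transpose() flips that into one vector per position in the pair,
# which data.table then uses as the output columns V1 and V2
cols <- transpose(pairs)
cols[[1]]                               # first element of each pair
cols[[2]]                               # second element of each pair
```

Inside `[, ..., by = group]` the same list of two vectors is built once per group, and data.table adds the group column itself.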
Or using base R
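# Compute the pairwise combinations of number within each group
lst <- by(sample$number, sample$group, FUN = combn, m = 2)
# Repeat each group label once per combination (names(lst) follows the
# order used by by()), then bind the pairs together as rows
data.frame(group = rep(names(lst), sapply(lst, ncol)),
           t(do.call(cbind, lst)))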
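If the `by` object feels opaque, the same idea can be sketched with `split` and `lapply`. This is only an alternative way of writing the base R approach, not a different method; the column names `first` and `second` and the helper names `by_group`, `pieces` and `result` are made up for the illustration. It assumes every group has at least two values, because `combn(x, 2)` treats a single positive number `x` as `seq_len(x)`.

```r
sample <- data.frame(
  group  = c("a", "a", "a", "a", "b", "b", "b"),
  number = c(1, 2, 3, 2, 4, 5, 3)
)

# One numeric vector per group
by_group <- split(sample$number, sample$group)

# Build a small data frame of pairs for each group, labelled with the group
pieces <- lapply(names(by_group), function(g) {
  pairs <- t(combn(by_group[[g]], 2))   # one row per pair within the group
  data.frame(group = g, first = pairs[, 1], second = pairs[, 2])
})

# Stack the per-group pieces into a single data frame
result <- do.call(rbind, pieces)
result
```

For the sample data this gives 9 rows: 6 for group "a" and 3 for group "b".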