How to parallelize combn()?

Gaë*_*ano 5 parallel-processing combinations r combinatorics

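The function combn() generates all combinations of the elements of x taken m at a time. It is fast and efficient when nCm is small (where n is the number of elements of x), but it quickly runs out of memory. For example: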

> combn(c(1:50), 12, simplify = TRUE)
Error in matrix(r, nrow = len.r, ncol = count) : 
invalid 'ncol' value (too large or NA)
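The size of the requested result explains the failure. The snippet below is a rough check, not a profile of combn() itself; the figure of 4 bytes per element assumes the result would be stored as an integer matrix.

```r
n <- 50
m <- 12

# Number of columns combn() would have to allocate.
n_comb <- choose(n, m)
print(n_comb)

# TRUE: the column count does not fit in an R integer,
# which is consistent with the "invalid 'ncol' value" error.
print(n_comb > .Machine$integer.max)

# Approximate storage for an m x n_comb integer matrix, in TiB
# (assumption: 4 bytes per element).
bytes <- m * n_comb * 4
print(bytes / 2^40)
```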

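I would like to know whether combn() can be modified so that it generates only k chosen combinations. Let's call this new function chosencombn(). Then we would have: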

> combn(c("a", "b", "c", "d"), m=2)
     [,1] [,2] [,3] [,4] [,5] [,6]
 [1,] "a"  "a"  "a"  "b"  "b"  "c" 
 [2,] "b"  "c"  "d"  "c"  "d"  "d" 

>chosencombn(c("a", "b", "c", "d"), m=2, i=c(1,4,6))
     [,1] [,2] [,3]
 [1,] "a"  "b"  "c" 
 [2,] "b"  "c"  "d"

>chosencombn(c("a", "b", "c", "d"), m=2, i=c(4,5))
     [,1] [,2]
 [1,] "b"  "b" 
 [2,] "c"  "d" 

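I understand that such a function requires an ordering of the combinations, so that the position of a given combination can be found immediately. Does such an ordering exist? Can it be coded to obtain a function as efficient as combn()?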
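One ordering that fits is the lexicographic order of index positions, which is the order the combn() output above follows. Under that ordering the i-th combination can be computed directly by counting how many combinations begin with each candidate element. The following is a minimal sketch, not a tuned implementation: `unrank_comb` is a hypothetical helper name, `chosencombn` takes its name and arguments from the example above, and ranks are held as doubles, so they are only exact below 2^53.

```r
# Hypothetical helper: the i-th (1-based) m-subset of 1..n in lexicographic order.
unrank_comb <- function(i, n, m) {
  out <- integer(m)
  r <- i - 1      # zero-based rank still to be consumed
  x <- 1L         # smallest candidate for the current position
  for (j in seq_len(m)) {
    repeat {
      # Combinations that have x at position j: choose(n - x, m - j).
      cnt <- choose(n - x, m - j)
      if (r < cnt) break
      r <- r - cnt      # skip all combinations with x at position j
      x <- x + 1L
    }
    out[j] <- x
    x <- x + 1L         # later positions must use larger elements
  }
  out
}

# Sketch of the function asked for: only the combinations with ranks i.
chosencombn <- function(x, m, i) {
  n <- length(x)
  stopifnot(m >= 1, m <= n, all(i >= 1), all(i <= choose(n, m)))
  idx <- vapply(i, function(k) unrank_comb(k, n, m), integer(m))
  # One column per requested rank, as combn() returns.
  matrix(x[idx], nrow = m)
}

print(chosencombn(c("a", "b", "c", "d"), m = 2, i = c(1, 4, 6)))
```

Each call costs at most about n evaluations of choose() per requested rank, and memory use depends only on length(i), not on nCm.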

ale*_*laz 1

The "trotter" package is useful for this, because it does not hold the combinations in memory.

library(trotter)

combs = cpv(2, c("a", "b", "c", "d"))
sapply(c(1, 4, 6), function(i) combs[i])
#     [,1] [,2] [,3]
#[1,] "a"  "b"  "c" 
#[2,] "b"  "c"  "d"
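Once combinations can be addressed by position, the parallelization asked about in the title reduces to splitting the range of positions into chunks and giving each chunk to a worker. The sketch below uses base R's parallel package rather than trotter, and is only one possible layout: `unrank_comb` and `par_combn` are hypothetical names, the chunk count and core count are arbitrary, and `mclapply()` relies on forking, so on Windows it would need `mc.cores = 1` or a cluster-based alternative such as `parLapply()`.

```r
library(parallel)

# Hypothetical helper: the i-th (1-based) m-subset of 1..n in lexicographic order.
unrank_comb <- function(i, n, m) {
  out <- integer(m)
  r <- i - 1
  x <- 1L
  for (j in seq_len(m)) {
    repeat {
      cnt <- choose(n - x, m - j)   # combinations with x at position j
      if (r < cnt) break
      r <- r - cnt
      x <- x + 1L
    }
    out[j] <- x
    x <- x + 1L
  }
  out
}

# Hypothetical driver: apply FUN to every combination, one chunk of ranks per task.
par_combn <- function(x, m, FUN, n_chunks = 4L, cores = 2L) {
  n <- length(x)
  total <- choose(n, m)
  # Chunk boundaries over ranks 1..total, kept as doubles to avoid integer overflow.
  breaks <- floor(seq(0, total, length.out = n_chunks + 1L))
  chunks <- lapply(seq_len(n_chunks), function(k) c(breaks[k] + 1, breaks[k + 1]))
  res <- mclapply(chunks, function(b) {
    if (b[1] > b[2]) return(list())   # empty chunk when total < n_chunks
    lapply(seq(b[1], b[2]), function(i) FUN(x[unrank_comb(i, n, m)]))
  }, mc.cores = cores)
  # Flatten the per-chunk lists into one list, in rank order.
  unlist(res, recursive = FALSE)
}

# Small check against combn(): sums of all 3-element subsets of 1:6.
sums <- unlist(par_combn(1:6, 3, sum))
print(sums)
```

Unranking every position independently keeps the sketch short. A worker could instead unrank only the first position of its chunk and step to each following combination in place, which would avoid the repeated choose() calls.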