I would like to parallelize a loop like
td <- data.frame(cbind(c(rep(1, 4), 2, rep(1, 5)), rep(1:10, 2)))
names(td) <- c("val", "id")
res <- rep(NA, NROW(td))
for (i in levels(interaction(td$id))) {
  res[td$id == i] <- mean(td$val[td$id != i])
}
using foreach() from the doParallel package in order to speed up the computation. Unfortunately, foreach doesn't seem to support direct assignment; at least
registerDoParallel(4)
res <- rep(NA, NROW(td))
foreach(i = levels(interaction(td$id))) %dopar% {
  res[td$id == i] <- mean(td$val[td$id != i])
}
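does not do what I want (that is, give the same result as the ordinary loop above). Any idea what I am doing wrong, or how I could somehow "hack" the .combine option of foreach to get what I want? Please note that the order of the id variable is not always the same in the original data set. Any hints would be greatly appreciated!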
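For what it's worth, here is a minimal sketch of one way this could be approached; it is not necessarily the only or the fastest option. The likely reason the assignment has no effect is that each worker evaluates the loop body on its own copy of the data, so writing into res inside the body never reaches the res in the calling session. Instead, each iteration can return its value, .combine = c can collect those values into a vector, and match() can map them back to the rows, which should also cope with ids that appear in varying order. The names ids, loo_means and cl, and the choice of two workers, are assumptions made for illustration only.

```r
library(doParallel)  # also attaches foreach and parallel

# Same data as in the question
td <- data.frame(cbind(c(rep(1, 4), 2, rep(1, 5)), rep(1:10, 2)))
names(td) <- c("val", "id")

# Assumption: two workers; adjust to the machine at hand
cl <- makeCluster(2)
registerDoParallel(cl)

ids <- unique(td$id)  # hypothetical helper: one entry per group

# Each iteration RETURNS a value instead of assigning into res;
# .combine = c concatenates the returned values in the order of ids
loo_means <- foreach(i = ids, .combine = c) %dopar% {
  mean(td$val[td$id != i])
}

stopCluster(cl)

# Map the per-id results back onto the rows, whatever the row order is
res <- loo_means[match(td$id, ids)]
```

Since the result is assembled in the calling session after the loop, no shared state is needed between workers. Whether this beats the serial loop depends on how expensive each iteration is; for a toy data set like this one the cluster overhead will probably dominate.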
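data.table masters! This is a follow-up to a question I posted here a few days ago. My question to you is how to improve the performance of the following application of data.table:
The function (written with the aim of being as fast as possible):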
prob <- function(a, ie1, b, a1, ie2, b2, ...) {
  ipf <- function(a, b, ...) {
    m <- length(a)
    n <- length(b)
    if (m < n) {
      r <- rank(c(a, b), ...)[1:m] - 1:m
    } else {
      r <- rank(c(a, b), ...)[(m + 1):(m + n)] - 1:n
    }
    s <- ifelse((n + m)^2 > 2^31, sum(as.double(r)), sum(r))/(as.double(m) * n)
    return(ifelse(m < n, s, 1 - s))
  }
  expand.grid.alt <- function(seq1, seq2) {
    cbind(rep.int(seq1, …