R: how to split dataframe in foreach %dopar%

M.T*_*.T. 4 foreach split r dataframe doparallel

This is a very simple example.

df = c("already ","miss you","haters","she's cool")
df = data.frame(df)

library(doParallel)
cl = makeCluster(4)
registerDoParallel(cl)    
foreach(i = df[1:4,1], .combine = rbind, .packages='tm')  %dopar% classification(i)
stopCluster(cl)

In the real case I have a data frame with n = 400000 rows. I don't know how to send nrow/ncluster rows to each cluster worker in one step: what should i = be?

I tried with isplitRows from library(itertools) without success.

lok*_*oki 6

You should try using indices to subset your data.

# iterate over every row index; i = nrow(df) alone would run only once, on the last row
foreach(i = seq_len(nrow(df)), .combine = rbind, .packages='tm')  %dopar% {
  tmp <- df[i, ]
  classification(tmp)
}

This takes a new row of the data.frame on each iteration.

Also, note that the result of the foreach loop is returned as a value, so you should assign it to a new variable, like this:

res <- foreach(i = 1:10, .combine = c, ....) %dopar% {
  # things you want to do
  x <- someFancyFunction()

  # the last value will be returned and combined by the .combine function
  x 
}
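
If the goal is to send roughly nrow/ncluster rows to each worker in a single task, rather than one row per iteration, one option is to split the data frame into contiguous blocks first and iterate over the blocks. The following is only a minimal sketch under stated assumptions: `classify_chunk` is a hypothetical stand-in for the question's `classification()` (which is not shown) and is assumed to accept a whole block of rows, and `n_workers` and `block_id` are names made up for the illustration.

```r
library(doParallel)  # also attaches foreach and parallel

# Hypothetical stand-in for classification(): takes a block of rows and
# returns one result row per input row (here, just the character count).
classify_chunk <- function(chunk) {
  data.frame(text = chunk$df, n_char = nchar(chunk$df),
             stringsAsFactors = FALSE)
}

df <- data.frame(df = c("already ", "miss you", "haters", "she's cool"),
                 stringsAsFactors = FALSE)

n_workers <- 2  # assumption: use 4 (or more) for the real 400000-row data
cl <- makeCluster(n_workers)
registerDoParallel(cl)

# Give each row a block number from 1 to n_workers, in contiguous runs.
block_id <- ceiling(seq_len(nrow(df)) * n_workers / nrow(df))

# split() on a data frame returns a list of smaller data frames.
chunks <- split(df, block_id)

# Each iteration ships one whole block to a worker; rbind stitches
# the per-block results back together in the original order.
res <- foreach(chunk = chunks, .combine = rbind) %dopar% {
  classify_chunk(chunk)
}

stopCluster(cl)
```

With one block per worker, each worker receives its data in one transfer, which usually reduces communication overhead compared with 400000 single-row tasks. If the real `classification()` only handles one text at a time, it can be applied inside the block with `lapply` or `sapply`.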