M.T*_*.T. 4 foreach split r dataframe doparallel
This is a very simple example.
df = c("already ","miss you","haters","she's cool")
df = data.frame(df)
library(doParallel)
cl = makeCluster(4)
registerDoParallel(cl)
foreach(i = df[1:4,1], .combine = rbind, .packages='tm') %dopar% classification(i)
stopCluster(cl)
In the real case I have a data frame with n = 400000 rows. I don't know how to send nrow/ncluster rows to each worker in one step: what should i be?
I tried isplitRows from library(itertools) without success.
You should try using indices to subset the data.
foreach(i = seq_len(nrow(df)), .combine = rbind, .packages='tm') %dopar% {
  # i runs over the row numbers 1..nrow(df), not just the single value nrow(df)
  tmp <- df[i, ]
  classification(tmp)
}
This takes a new row of the data.frame on each iteration.
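With 400000 rows, one task per row may spend more time on scheduling than on work, so it can help to hand each worker a whole block of rows instead. The following is only a sketch under some assumptions: `classification()` here is a hypothetical stand-in that counts characters (the real function is not shown in the question), `n_workers` is an illustrative name, and `splitIndices()` from the parallel package is used to cut the row numbers into roughly equal chunks.

```r
library(doParallel)  # also attaches foreach and parallel

# Hypothetical stand-in for the asker's classification(); it only counts characters.
classification <- function(x) nchar(x)

df <- data.frame(df = c("already ", "miss you", "haters", "she's cool"),
                 stringsAsFactors = FALSE)

n_workers <- 2  # assumed worker count for this toy example
cl <- makeCluster(n_workers)
registerDoParallel(cl)

# One chunk of row indices per worker, e.g. rows 1-2 and rows 3-4 here.
chunks <- parallel::splitIndices(nrow(df), n_workers)

res <- foreach(idx = chunks, .combine = rbind) %dopar% {
  block <- df[idx, , drop = FALSE]  # all rows of this chunk, still a data.frame
  data.frame(text = block$df,
             result = vapply(block$df, classification, integer(1),
                             USE.NAMES = FALSE))
}

stopCluster(cl)
```

Each iteration now receives about nrow(df)/n_workers rows, and `.combine = rbind` stacks the per-chunk data frames back together in the original row order.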
Also, note that the result of a foreach loop is returned as a new value, so you should assign it to a variable like this:
res <- foreach(i = 1:10, .combine = c, ....) %dopar% {
# things you want to do
x <- someFancyFunction()
# the last value will be returned and combined by the .combine function
x
}
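As a concrete illustration of that pattern, here is a minimal sketch that runs sequentially with `%do%`, so it needs no cluster. `sqrt()` is only a placeholder for whatever function you actually call; with a registered backend, switching to `%dopar%` should leave the assignment unchanged.

```r
library(foreach)

# sqrt() is a placeholder for the real per-iteration work.
res <- foreach(i = 1:10, .combine = c) %do% {
  x <- sqrt(i)
  x  # the last value is returned and appended to res by .combine = c
}

length(res)  # one element per iteration
```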