How to set .libPaths (checkpoint) on workers when running parallel computations in R

nee*_*elp 3 parallel-processing foreach r checkpoint r-future

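I use the checkpoint package for reproducible data analysis. Some computations take a long time, so I want to run them in parallel. When they run in parallel, however, the checkpoint is not set on the workers, so I get the error "there is no package called xy" (because the package is not installed in my default library directory).

How can I make sure that every worker uses the package versions in the checkpoint folder? I tried setting .libPaths inside the foreach code, but that does not seem to work. I would also prefer to set the checkpoint/libPaths once globally rather than in every foreach call.

Another option might be to change the .Rprofile file, but I would rather not do that.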

checkpoint::checkpoint("2018-06-01")

library(foreach)
library(doFuture)
library(future)

doFuture::registerDoFuture()
future::plan("multisession")

l <- .libPaths()

# Code to run in parallel does not make much sense of course but I wanted to keep it simple.
res <- foreach::foreach(
  x = unique(iris$Species),
  lib.path = l
) %dopar% {
  .libPaths(lib.path)
  stringr::str_c(x, "_")
}

Error in { : task 2 failed - "there is no package called 'stringr'"

Hen*_*ikB 6

Author of the future package here.

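UPDATE 2022-05-25: As of future 1.20.0 (2021-11-03), multisession parallel workers automatically inherit the R library paths (= .libPaths()) from the main R session, so the workaround below is no longer needed. Other future backends may still require it, however.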
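
Whether a given setup already behaves this way can be checked by asking a worker for its library paths. The following is a minimal sketch, assuming the future package is installed; the variable names `worker_libs` and `main_libs` and the choice of two workers are illustrative only.

```r
library(future)

plan(multisession, workers = 2)

# Evaluate .libPaths() on a worker and fetch the result
worker_libs <- value(future(.libPaths()))
main_libs <- .libPaths()

print(worker_libs)

# TRUE if the worker sees exactly the main session's library paths
print(identical(worker_libs, main_libs))

# Shut down the background workers again
plan(sequential)
```

If the comparison prints FALSE, the workaround described below still applies to that setup.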

It is enough to pass the library paths of the main R process as a global variable libs and set them on each worker with .libPaths(libs):

## Use CRAN checkpoint from 2018-07-24 to get future (>= 1.9.0) [1],
## otherwise the below stdout won't be relayed back to the master
## R process, but setting .libPaths() does also work in older
## versions of the future package.
## [1] https://cran.microsoft.com/snapshot/2018-07-24/web/packages/future
checkpoint::checkpoint("2018-07-24")
stopifnot(packageVersion("future") >= "1.9.0")

libs <- .libPaths()
print(libs)
### [1] "/home/hb/.checkpoint/2018-07-24/lib/x86_64-pc-linux-gnu/3.5.1"
### [2] "/home/hb/.checkpoint/R-3.5.1"
### [3] "/usr/lib/R/library"

library(foreach)

doFuture::registerDoFuture()
future::plan("multisession")

res <- foreach::foreach(x = unique(iris$Species)) %dopar% {
  ## Use the same library paths as the master R session
  .libPaths(libs)

  cat(sprintf("Library paths used by worker (PID %d):\n", Sys.getpid()))
  cat(sprintf(" - %s\n", sQuote(.libPaths())))

  stringr::str_c(x, "_")
}

###  - ‘/home/hb/.checkpoint/2018-07-24/lib/x86_64-pc-linux-gnu/3.5.1’
###   - ‘/home/hb/.checkpoint/R-3.5.1’
###   - ‘/usr/lib/R/library’
### Library paths used by worker (PID 9394):
###  - ‘/home/hb/.checkpoint/2018-07-24/lib/x86_64-pc-linux-gnu/3.5.1’
###   - ‘/home/hb/.checkpoint/R-3.5.1’
###   - ‘/usr/lib/R/library’
### Library paths used by worker (PID 9412):
###  - ‘/home/hb/.checkpoint/2018-07-24/lib/x86_64-pc-linux-gnu/3.5.1’
###   - ‘/home/hb/.checkpoint/R-3.5.1’
###   - ‘/usr/lib/R/library’

str(res)
### List of 3
###  $ : chr "setosa_"
###  $ : chr "versicolor_"
###  $ : chr "virginica_"

FYI, it is on the roadmap for future to make it easier to pass the library paths down to the workers.
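
For setting the library paths once per worker rather than inside every foreach call, one possible approach is to start the workers explicitly and configure them before use. The sketch below is not from the answer above: it uses only the base parallel package, and the worker count of two is arbitrary. Note that `.libPaths` is wrapped in an anonymous function; passing `.libPaths` itself to `clusterCall()` would ship a copy of the function together with its private state, so the call would not change the worker's real library paths.

```r
library(parallel)

# Library paths of the main R session (e.g. after checkpoint::checkpoint())
libs <- .libPaths()

# Start two local workers; the number is arbitrary
cl <- makeCluster(2)

# Set the library paths once on every worker.
# The wrapper makes each worker call its own .libPaths().
invisible(clusterCall(cl, function(paths) .libPaths(paths), libs))

# Ask each worker which library paths it now uses
worker_libs <- clusterEvalQ(cl, .libPaths())
str(worker_libs)

stopCluster(cl)
```

A cluster prepared this way can be handed to future with `future::plan(future::cluster, workers = cl)` before `stopCluster()` is called, so that later `%dopar%` loops run on workers that already have the paths set.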

My details:

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.1 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] foreach_1.4.4

loaded via a namespace (and not attached):
[1] drat_0.1.4         compiler_3.5.1     BiocManager_1.30.2 parallel_3.5.1     tools_3.5.1        listenv_0.7.0      doFuture_0.6.0
[8] codetools_0.2-15   iterators_1.0.10   digest_0.6.15      globals_0.12.1     checkpoint_0.4.5   future_1.9.0