How can I run a simple parallel array assignment operation in Julia?

Question

hg1*_*153 2 parallel-processing differential-equations julia differentialequations.jl

I have to solve a system of differential equations many times, iterating over a parameter. To do this, I loop over a list of parameter values and store the solution (evaluated at an array of time values) for each one. So I have a 2D array in which I store the solutions (each column holds the solution for one value of the parameter).

Now, since any iteration has nothing to do with another one, I thought of doing this in parallel.

Here is my code:

using DifferentialEquations
using SharedArrays
using DelimitedFiles
using Distributed

function tf(x,w)
    return x*sin(w*x)
end

function sys!(dv,v,w,t)
    dv[1] = w*v[1]
    dv[2] = tf(v[1],w)
end

times = LinRange(0.1,2,25)

params = LinRange(0.1,1.2,100)

sols = SharedArray{Float64,2}((length(times),length(params)))

@distributed for i=1:length(params)
    println(i)
    init_val = [1.0,1.0]
    tspan = (0.0,2.0)
    prob = ODEProblem(sys!,init_val,tspan,params[i])
    sol = solve(prob)
    sols[:,i] .= sol(times)[2,:]
end

writedlm("output.txt",sols)

Now, when I run this without the @distributed prefixed to the loop, this runs perfectly.

When I run this code, however, the println statement does not work, and although the file "output.txt" is stored, it is full of zeros.

I'm running this code from the command line this way

julia -p 4 trycode.jl

This shows no output and just works for a minute and does nothing, although the file "output.txt" is stored. It's as if the loop is never entered.

I would really appreciate some help on how to set up this simple parallel loop.

Answer 1

Nil*_*dat 6

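As Bill says, there are two main ways to think about parallelism in Julia: the threaded model, which was introduced in Julia 1.3 and does shared-memory parallelism via the Threads.@threads macro, and distributed processing using the Distributed.@distributed macro, which parallelizes across separate Julia processes.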

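Threads are definitely closer to an "automagic" parallel speed-up with minimal or no rewriting of code, and are often a great option. You do have to make sure that whatever operation you are running is thread-safe, so always check that the results come out the same as in the serial version.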
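As a minimal sketch of the threaded approach, the loop below fills one column per parameter value and then compares the result against a serial run. The function work is a hypothetical stand-in for the ODE solve (it is not part of the original code), chosen so the example runs without DifferentialEquations. The number of threads is whatever Julia was started with, for example via the JULIA_NUM_THREADS environment variable.

```julia
using Base.Threads

# Hypothetical stand-in for one ODE solve: cheap, deterministic, depends only on (t, w).
work(t, w) = t * sin(w * t)

times  = LinRange(0.1, 2, 25)
params = LinRange(0.1, 1.2, 100)

# Each iteration writes only to its own column, so no two threads write to the same memory.
function fill_threaded(times, params)
    out = Matrix{Float64}(undef, length(times), length(params))
    @threads for i in eachindex(params)
        out[:, i] .= work.(times, params[i])
    end
    return out
end

# Serial reference used to check the threaded result.
fill_serial(times, params) = [work(t, w) for t in times, w in params]

sols_threaded = fill_threaded(times, params)
sols_serial   = fill_serial(times, params)
println("threads: ", nthreads(), ", results match: ", sols_threaded == sols_serial)
```

Writing to disjoint columns is what makes this loop thread-safe; a loop that accumulated into a single shared variable would need a lock or a per-thread buffer instead.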

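Since your question was originally about @distributed parallelism, let me answer that as well. If you use @distributed, the simplest mental model (I believe) for what is going on is to imagine that you are running your code in entirely separate Julia REPLs.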
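The "separate REPLs" picture can be checked directly. The sketch below assumes one local worker is enough for the demonstration; f_local and f_all are hypothetical names used only here. It asks the worker whether each function exists in its own Main module.

```julia
using Distributed
addprocs(1)   # assumption: one local worker process is enough for this check

f_local(x) = 2x              # defined on the main process only
@everywhere f_all(x) = 2x    # defined on the main process and on every worker

w = first(workers())

# Ask the worker whether each name is defined in *its* Main module.
worker_has_local = remotecall_fetch(isdefined, w, Main, :f_local)
worker_has_all   = remotecall_fetch(isdefined, w, Main, :f_all)

println("f_local on worker: ", worker_has_local)
println("f_all on worker:   ", worker_has_all)
```

A function that only exists on the main process is simply unknown to the worker, exactly as if it had been typed into a different REPL session.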

Here is a version of your code adapted to the @distributed model:

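using Distributed
addprocs(2)  # start two worker processes (not needed when Julia is launched with -p 2)

using SharedArrays
using DelimitedFiles

# Everything the loop body uses must be defined on every process
@everywhere begin
    using DifferentialEquations

    tf(x,w) = x*sin(w*x)

    function sys!(dv,v,w,t)
        dv[1] = w*v[1]
        dv[2] = tf(v[1],w)
    end

    times = LinRange(0.1,2,25)
    params = LinRange(0.1,1.2,100)
end

# One column per parameter value, accessible from all processes
sols = SharedArray{Float64,2}((length(times),length(params)))

# @sync blocks until all workers have finished and surfaces their errors
@sync @distributed for i=1:length(params)
    println(i)
    init_val = [1.0,1.0]
    tspan = (0.0,2.0)
    prob = ODEProblem(sys!,init_val,tspan,params[i])
    sol = solve(prob)
    sols[:,i] .= sol(times)[2,:]
end

sols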

What has changed?

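I added addprocs(2) at the start of the script. This is not necessary if you start Julia with -p 2 (or however many processes you want), but I often find it easier to reason about code when it sets up the parallel environment explicitly. Note that this is currently not possible for threads: you need to set the JULIA_NUM_THREADS environment variable before starting Julia, and you cannot change the number of threads once Julia is up and running.
I then moved parts of the code into an @everywhere begin ... end block. This essentially runs the operations enclosed in the block on all processes at the same time. Going back to the mental model of separate Julia instances, you have to look at what is inside your @distributed loop and make sure that all functions and variables it uses are actually defined on all processes. For example, to make sure every process knows what ODEProblem is, you need to run using DifferentialEquations on all of them.
Finally, I added @sync to the distributed loop. This is mentioned in the documentation for @distributed. Running the @distributed macro on a for loop spawns an asynchronous green thread (a Task) for the distributed execution, returns a handle to it, and moves on to the next line. Since you want to wait until execution has actually finished, the synchronisation via @sync is required. The problem with your original code is that, without waiting for the green thread to complete, it swallows the error and returns immediately, which is why your sols array is never filled. You can see this if you run your original code and only add @sync: you will then get TaskFailedException: on worker 2 - UndefVarError: #sys! not defined, which tells you that your worker processes do not know about the functions you defined on the main process. In practice you almost always want @sync, unless you plan to run many such distributed loops in parallel. You also do not need @sync when you use an aggregator (reducer) function in the distributed loop, i.e. the @distributed (func) for i in 1:1000 form.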
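The two loop forms mentioned above can be compared in a small sketch. It assumes two local workers, and work is a hypothetical stand-in for the ODE solve so the example runs without DifferentialEquations. The first loop writes into a SharedArray and needs @sync; the second uses a (+) reducer and waits for its result on its own.

```julia
using Distributed
addprocs(2)            # assumption: two local worker processes
using SharedArrays

# Hypothetical stand-in for the expensive computation; defined on every process.
@everywhere work(w) = w^2

n = 100
out = SharedArray{Float64,1}((n,))

# Without @sync, @distributed returns a Task immediately and `out` may not be filled yet.
# With @sync, the main process blocks until all workers are done and their errors are rethrown.
@sync @distributed for i in 1:n
    out[i] = work(i)
end

# With a reducer, @distributed itself waits for and returns the combined result.
total = @distributed (+) for i in 1:n
    work(i)
end

println(sum(out) == total)
```

The reducer form is convenient when only an aggregate is needed; the SharedArray form matches the original problem, where every per-parameter result has to be kept.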

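So what is the best solution here? The answer is that I don't know. @threads is a great option for quickly parallelizing thread-safe operations without rewriting code, and it is still being actively developed and improved, so it is likely to get even better in the future. There is also pmap in the Distributed standard library, which gives you additional options, but this answer is long enough as it is! In my personal experience, nothing replaces (1) thinking about your problem and (2) benchmarking execution. The things you want to think about are the runtime of your problem (both in total and for each individual operation you want to distribute) and the message passing/memory access requirements.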
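For reference, a pmap version might look like the sketch below. It assumes two local workers, and column and solve_all are hypothetical helpers standing in for "solve the system for one parameter and evaluate it at times"; they are not part of the original code.

```julia
using Distributed
addprocs(2)   # assumption: two local worker processes

# Hypothetical stand-in for one ODE solve: returns the results at `times` for parameter w.
@everywhere column(w, times) = [t * sin(w * t) for t in times]

# pmap hands parameters to whichever worker is free and returns results in input order.
solve_all(times, params) = pmap(w -> column(w, times), params)

times  = LinRange(0.1, 2, 25)
params = LinRange(0.1, 1.2, 100)

cols = solve_all(times, params)

# Stack the returned columns into a (time, parameter) matrix.
sols = reduce(hcat, cols)
println(size(sols))
```

Because pmap returns the results to the main process, no SharedArray is needed, and handing out work item by item can help when individual solves take very different amounts of time.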

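The upside is that, while you may have to spend some effort thinking about your problem, Julia has a lot of good options for making the most of every hardware situation, from a rickety old laptop with two cores (like the one I'm typing this on) to multi-node high-performance clusters (which has made Julia one of the very few programming languages to achieve petaflop performance, although to be fair that is a bit more involved than my or Bill's answer :)).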

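Unfortunately, @distributed assigns exactly the same number of iterations to every task. So if your computation times are unbalanced, it may not be a good approach (unless you have a lot of work items and a good chance of them balancing out). (2 upvotes)
On whether threads or @distributed is better: in practice, Julia threads scale poorly when there are too many of them. So with 2 or 4 cores you can use either threads or @distributed, but with 64 cores you definitely want @distributed. (2 upvotes)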
