Chr*_*z J 6 time allocation julia
我开始使用 Julia 主要是因为它的速度。目前,我正在解决一个定点问题。虽然我的代码的当前版本运行速度很快,但我想知道一些提高其速度的方法。
\n首先,让我总结一下算法。
\n为了实现它,我创建了两个结构:
\n struct Parm\n lC::Array{Float64, 2} # Lower limit\n uC::Array{Float64, 2} # Upper limit\n \xce\xb3::Float64 # CRRA coefficient\n \xce\xb4::Float64 # factor in the euler\n \xce\xb31::Float64 # \n r1::Float64 # inverse of the gross interest rate\n yb1::Array{Float64, 2} # y - b(t+1)\n P::Array{Float64, 2} # Transpose of transition matrix\n end\n\n mutable struct Upd1\n pol::Array{Float64,2} # policy function\n b::Array{Float64, 1} # exogenous grid for interpolation\n dif::Float64 # updating difference\n end\nRun Code Online (Sandbox Code Playgroud)\n第一个是一组参数,第二个存储决策规则C1。我还定义了一些函数:
\n function eulerm(x::Upd1,p::Parm)\n ct = p.\xce\xb4 *(x.pol.^(-p.\xce\xb3)*p.P).^(-p.\xce\xb31); #Euler equation \n bt = p.r1.*(ct .+ p.yb1); #Endeogenous grid for bonds\n return ct,bt\n end\n\n function interp0!(bt::Array{Float64},ct::Array{Float64},x::Upd1, p::Parm)\n polold = x.pol;\n polnew = similar(x.pol); \n @inbounds @simd for col in 1:size(bt,2)\n F1 = LinearInterpolation(bt[:,col], ct[:,col],extrapolation_bc=Line());\n polnew[:,col] = F1(x.b);\n end\n polnew[polnew .< p.lC] .= p.lC[polnew .< p.lC];\n polnew[polnew .> p.uC] .= p.uC[polnew .> p.uC];\n dif = maximum(abs.(polnew - polold));\n return polnew,dif\n end\n\n function updating!(x::Upd1,p::Parm)\n ct, bt = eulerm(x,p); # endogeneous grid\n x.pol, x.dif = interp0!(bt,ct,x,p);\n end\n\n function conver(x::Upd1,p::Parm)\n while x.dif>1e-8\n updating!(x,p);\n end\n end\nRun Code Online (Sandbox Code Playgroud)\n第一个公式实现步骤 2 和 3。第三个公式进行更新(步骤 4 的最后一部分),最后一个公式迭代直至收敛(步骤 5)。
\n最重要的功能是第二个。它进行插值。当我运行该函数时@time,@btime我意识到最大数量的分配位于该函数内的循环中。我尝试通过不定义polnew并直接进入来减少它x.pol,但在这种情况下,结果不正确,因为它只需要两次迭代即可收敛(我认为 Julia 认为这polold与x.pol同时)。
任何建议都会受到好评。
\n对于任何想要自己运行它的人,我添加了其余所需的代码:
\n function rouwen(\xcf\x81::Float64, \xcf\x832::Float64, N::Int64)\n if (N % 2 != 1)\n return "N should be an odd number"\n end\n sigz = sqrt(\xcf\x832/(1-\xcf\x81^2));\n zn = sigz*sqrt(N-1);\n z = range(-zn,zn,N);\n p = (1+\xcf\x81)/2;\n q = p;\n Rho = [p 1-p;1-q q];\n for i = 3:N\n zz = zeros(i-1,1);\n Rho = p*[Rho zz; zz' 0] + (1-p)*[zz Rho; 0 zz'] + (1-q)*[zz' 0; Rho zz] + q *[0 zz'; zz Rho];\n Rho[2:end-1,:] = Rho[2:end-1,:]/2;\n end\n return z,Rho;\n end\n\n #############################################################\n # Parameters of the model\n ############################################################\n\n lb = 0; ub = 1000; pivb = 0.25; nb = 500; \n \xcf\x81 = 0.988; \xcf\x83z = 0.0439; \xce\xbcz =-\xcf\x83z/2; nz = 7;\n \xcf\x95 = 0.0; \xcf\x83e = 0.6376; \xce\xbce =-\xcf\x83e/2; ne = 7; \n \xce\xb2 = 0.98; r = 1/400; \xce\xb3 = 1; \n\n b = exp10.(range(start=log10(lb+pivb), stop=log10(ub+pivb), length=nb)) .- pivb;\n\n #=========================================================\n Algorithm\n ======================================================== =#\n (z,Pz) = rouwen(\xcf\x81,\xcf\x83z, nz);\n \xce\xbcZ = \xce\xbcz/(1-\xcf\x81);\n z = z .+ \xce\xbcZ;\n (ee,Pe) = rouwen(\xcf\x95,\xcf\x83e,ne);\n ee = ee .+ \xce\xbce;\n y = exp.(vec((z .+ ee')'));\n P = kron(Pz,Pe);\n R = 1 + r;\n r1 = R^(-1);\n \xce\xb31 = 1/\xce\xb3;\n \xce\xb4 = (\xce\xb2*R)^(-\xce\xb31);\n m = R*b .+ y';\n lC = max.(m .- ub,0);\n uC = m .- lb;\n by1 = b .- y';\n\n # initial guess for C0\n c0 = 0.1*(m); \n # Set of parameters\n pp = Parm(lC,uC,\xce\xb3,\xce\xb4,\xce\xb31,r1,by1,P');\n # Container of results\n up1 = Upd1(c0,b,1);\n # Fixed point problem\n conver(up1,pp)\nRun Code Online (Sandbox Code Playgroud)\n更新按照建议,我对第三个函数进行了以下更改
\n function interp0!(bt::Array{Float64},ct::Array{Float64},x::Upd1, p::Parm)\n polold = x.pol;\n polnew = similar(x.pol); \n @inbounds for col in 1:size(bt,2)\n F1 = LinearInterpolation(@view(bt[:,col]), @view(ct[:,col]),extrapolation_bc=Line());\n polnew[:,col] = F1(x.b);\n end\n for j in eachindex(polnew)\n polnew[j] < p.lC[j] ? polnew[j] = p.lC[j] : nothing\n polnew[j] > p.uC[j] ? polnew[j] = p.uC[j] : nothing\n end \n dif = maximum(abs.(polnew - polold));\n return polnew,dif\n end\nRun Code Online (Sandbox Code Playgroud)\n这导致速度提高(从约 1.5 秒到约 1.3 秒)。并减少分配数量。我注意到的一些事情是:
\npolnew[:,col] = F1(x.b)可以polnew[:,col] .= F1(x.b)减少总分配量,但时间变慢,这是为什么呢? up1 = Upd1(c0,b,1);\n @time conver(up1,pp)\n 1.338042 seconds (385.72 k allocations: 1.157 GiB, 3.37% gc time)\n up1 = Upd1(c0,b,1);\n @btime conver(up1,pp)\n 4.200 ns (0 allocations: 0 bytes)\nRun Code Online (Sandbox Code Playgroud)\n准确地说,在这两种情况下,我都运行了几次,并为每行选择了代表数字。\n这是否意味着编译期间所有时间都应分配?
\n开始按照 @DNF 建议的“性能提示”进行操作,但在下面您将找到针对您的代码的最重要的注释。
julia> julia> a = rand(3,4);
julia> @btime $a[3,:] = $a[3,:] ./ 2;
40.726 ns (2 allocations: 192 bytes)
julia> @btime $a[3,:] .= $a[3,:] ./ 2;
20.562 ns (1 allocation: 96 bytes)
Run Code Online (Sandbox Code Playgroud)
julia> @btime sum($a[3,:]);
18.719 ns (1 allocation: 96 bytes)
julia> @btime sum(@view($a[3,:]));
5.600 ns (0 allocations: 0 bytes)
Run Code Online (Sandbox Code Playgroud)
当您对每个元素进行循环时,围绕一行的代码 polnew[polnew .< p.lC] .= p.lC[polnew .< p.lC];将进行更少的分配forpolnew
当代码调用复杂的外部函数时,@simd 对条件语句(第 3 点)没有影响