Julia: vectorized vs. devectorized code

Ana*_*sid 3 arrays for-loop vectorization julia

From what I understood, Julia is supposed to make for loops faster and as fast as vectorized operations. I wrote three versions of a simple function that finds distance using for loops vs. a vectorized operation vs. the latter with DataFrames:

x = rand(500)
y = rand(500)
a = rand()
b = rand()

function devect()
    dist = Array(Float64, 0)
    twins = Array(Float64, 0,2)

    for i in 1:500
        dist = [dist; sqrt((x[i] - a)^2 + (y[i] - b)^2)]
        if dist[end] < 0.05
            twins = [twins; [x y][end,:]]
        end
    end

    return twins
end

function vect()
    d = sqrt((x-a).^2 + (y-b).^2)
    return [x y][d .< 0.05,:]
end

using DataFrames

function df_vect()
    df = DataFrame(x=x, y=y)
    dist = sqrt((df[:x]-a).^2 + (df[:y]-b).^2)

    return df[dist .< 0.05,:]
end

n = 10^3

@time for i in [1:n] devect() end
@time for i in [1:n] vect() end
@time for i in [1:n] df_vect() end
Run Code Online (Sandbox Code Playgroud)

Output:

elapsed time: 4.308049576 seconds (1977455752 bytes allocated, 24.77% gc time)
elapsed time: 0.046759167 seconds (37295768 bytes allocated, 54.36% gc time)
elapsed time: 0.052463997 seconds (30359752 bytes allocated, 49.44% gc time)
Run Code Online (Sandbox Code Playgroud)

Why does the vectorized version perform so much faster?

Ste*_*ski 11

http://julia.readthedocs.org/en/latest/manual/performance-tips/#avoid-global-variables

你的代码在任何地方使用非常量全局变量,这意味着你基本上回到了解释语言的性能领域,因为在编译时不能保证它们的类型.要快速加速,只需为所有全局变量赋值前缀const.


ptb*_*ptb 7

跟进我对用于在devect中构建解决方案的方法的评论.这是我的代码

julia> x, y, a, b = rand(500), rand(500), rand(), rand()

julia> function devect{T}(x::Vector{T}, y::Vector{T}, a::T, b::T)
       res = Array(T, 0)
       dim1 = 0
       for i = 1:size(x,1)
           if sqrt((x[i]-a)^2+(y[i]-b)^2) < 0.05
               push!(res, x[i])
               push!(res, y[i])
               dim1 += 1
           end
       end
       reshape(res, (2, dim1))'
   end
devect (generic function with 1 method)

julia> function vect{T}(x::Vector{T}, y::Vector{T}, a::T, b::T)
       d = sqrt((x-a).^2+(y-b).^2)
       [x y][d.<0.05, :]
   end
vect (generic function with 1 method)

julia> @time vect(x, y, a, b)
elapsed time: 3.7118e-5 seconds (37216 bytes allocated)
2x2 Array{Float64,2}:
 0.978099  0.0405639
 0.94757   0.0224974

julia> @time vect(x, y, a, b)
elapsed time: 7.1977e-5 seconds (37216 bytes allocated)
2x2 Array{Float64,2}:
 0.978099  0.0405639
 0.94757   0.0224974

julia> @time devect(x, y, a, b)
elapsed time: 1.7146e-5 seconds (376 bytes allocated)
2x2 Array{Float64,2}:
 0.978099  0.0405639
 0.94757   0.0224974

julia> @time devect(x, y, a, b)
elapsed time: 1.3065e-5 seconds (376 bytes allocated)
2x2 Array{Float64,2}:
 0.978099  0.0405639
 0.94757   0.0224974

julia> @time devect(x, y, a, b)
elapsed time: 1.8059e-5 seconds (376 bytes allocated)
2x2 Array{Float64,2}:
 0.978099  0.0405639
 0.94757   0.0224974
Run Code Online (Sandbox Code Playgroud)

可能有更快的方法来执行devect解决方案但注意分配的字节差异.如果一个devectorized解决方案分配的内存比矢量化解决方案多,那么它可能是错误的(至少在Julia中).