Expanding a vector of values by a vector of instance counts - Julia

Fra*_*art 0 expand julia

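I want to expand a vector of values using a vector that holds the number of instances of each value. I came up with the following code to do the job, but this seems like a common use case, so I am probably missing something.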

valuelist = ["a","b","d","z"]
numberofinstance = [3,5,1,11]

valuevector = String[]
for i in 1:length(numberofinstance) 
  append!(valuevector , repeat([valuelist[i]], numberofinstance[i])) 
end

crs*_*nbr 5

If using a package (one that is basically part of the stdlib) is fine, the function you are looking for is called inverse_rle and lives in StatsBase.jl:

julia> using StatsBase

julia> inverse_rle(valuelist, numberofinstance)
20-element Array{String,1}:
 "a"
 "a"
 "a"
 "b"
 "b"
 "b"
 "b"
 "b"
 "d"
 "z"
 "z"
 "z"
 "z"
 "z"
 "z"
 "z"
 "z"
 "z"
 "z"
 "z"

julia> @btime inverse_rle($valuelist, $numberofinstance);
  76.799 ns (1 allocation: 240 bytes)

julia> @btime yoursolution($valuelist, $numberofinstance);
  693.329 ns (13 allocations: 1.55 KiB)
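
The benchmarks call a function named yoursolution that is never defined in the thread. A minimal sketch, assuming it is simply the loop from the question wrapped in a function (the name and signature are taken from the benchmark calls, the body from the question):

```julia
# Assumption: `yoursolution` is the question's loop wrapped in a function.
function yoursolution(valuelist, numberofinstance)
    valuevector = String[]
    for i in 1:length(numberofinstance)
        # Build a temporary vector holding valuelist[i] repeated
        # numberofinstance[i] times, then append it to the result.
        append!(valuevector, repeat([valuelist[i]], numberofinstance[i]))
    end
    return valuevector
end

valuelist = ["a", "b", "d", "z"]
numberofinstance = [3, 5, 1, 11]

result = yoursolution(valuelist, numberofinstance)
```

Each iteration allocates a temporary vector and may grow valuevector, which is probably where most of the reported allocations come from.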

If you want to avoid a package, you could in principle broadcast repeat, spelled ^ (power) for strings, like so,

vcat(collect.(.^(valuelist, numberofinstance))...)

but I think this is harder to parse and also slower than inverse_rle,

julia> @btime yoursolution($valuelist, $numberofinstance);
  693.329 ns (13 allocations: 1.55 KiB)

julia> @btime vcat(collect.(.^($valuelist, $numberofinstance))...)
  472.615 ns (9 allocations: 800 bytes)
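
One caveat with the power-based one-liner: "a"^3 is the single string "aaa", and collect splits it into characters, so the result is a vector of Char and only lines up with the input when every value is one character long. A small sketch of package-free alternatives that keep the String element type (fill and the comprehension are swapped in here for illustration and are not part of the original answer; no timings are claimed for them):

```julia
valuelist = ["a", "b", "d", "z"]
numberofinstance = [3, 5, 1, 11]

# Infix spelling of the broadcasted power: each element becomes one long
# string ("aaa", "bbbbb", ...), which collect then splits into Chars.
chars = vcat(collect.(valuelist .^ numberofinstance)...)

# fill("a", 3) builds ["a", "a", "a"]; broadcasting yields one such vector
# per value, and reduce(vcat, ...) concatenates them without splatting.
strings = reduce(vcat, fill.(valuelist, numberofinstance))

# The same expansion as a flattened comprehension.
strings2 = [v for (v, n) in zip(valuelist, numberofinstance) for _ in 1:n]
```

Here chars has element type Char, while strings and strings2 are vectors of String.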

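However, since Julia lets you write fast loops, you can easily define your own simple function. The following is much faster than your solution (and as fast as the StatsBase implementation):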

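function multiply(vs, ns)
   # Preallocate the result: one slot per requested instance
   r = Vector{String}(undef, sum(ns))
   c = 1  # next position to write in r
   # @inbounds skips bounds checks, so vs and ns must have the same length
   @inbounds for i in axes(ns, 1)
       for k in 1:ns[i]
           r[c] = vs[i]
           c += 1
       end
   end
   r
end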
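
The multiply function hard-codes Vector{String} and uses @inbounds without checking that both inputs have the same length. A hedged sketch of a more general variant follows; the name expand and the argument checks are assumptions added for illustration, not part of the original answer, and it has not been benchmarked here:

```julia
# Hypothetical generalization of `multiply`: works for any element type
# and validates its inputs instead of relying on @inbounds.
function expand(vs::AbstractVector, ns::AbstractVector{<:Integer})
    length(vs) == length(ns) ||
        throw(DimensionMismatch("vs and ns must have the same length"))
    all(>=(0), ns) || throw(ArgumentError("counts must be non-negative"))

    # Preallocate using the element type of the input values.
    r = Vector{eltype(vs)}(undef, sum(ns))
    c = 1  # next position to write in r
    for (v, n) in zip(vs, ns)
        for _ in 1:n
            r[c] = v
            c += 1
        end
    end
    return r
end

expanded = expand(["a", "b", "d", "z"], [3, 5, 1, 11])
```

Dropping @inbounds keeps the function memory-safe at a possibly small cost in speed; it can be added back once the length check guarantees the indices are valid.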

Benchmarks:

julia> @btime yoursolution($valuelist, $numberofinstance);
  693.329 ns (13 allocations: 1.55 KiB)

julia> @btime multiply($valuelist, $numberofinstance);
  76.469 ns (1 allocation: 240 bytes)