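I want to expand a vector of values using a second vector that holds the number of instances of each value. I came up with the following code to do the job, but this seems like a common use case, so I am probably missing something.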
valuelist = ["a","b","d","z"]
numberofinstance = [3,5,1,11]
valuevector = String[]
for i in 1:length(numberofinstance)
    append!(valuevector, repeat([valuelist[i]], numberofinstance[i]))
end
If using a package is fine (one that is basically stdlib), the function you are looking for is called inverse_rle in StatsBase.jl:
julia> using StatsBase
julia> inverse_rle(valuelist, numberofinstance)
20-element Array{String,1}:
"a"
"a"
"a"
"b"
"b"
"b"
"b"
"b"
"d"
"z"
"z"
"z"
"z"
"z"
"z"
"z"
"z"
"z"
"z"
"z"
julia> using BenchmarkTools
julia> @btime inverse_rle($valuelist, $numberofinstance);
76.799 ns (1 allocation: 240 bytes)
julia> @btime yoursolution($valuelist, $numberofinstance);
693.329 ns (13 allocations: 1.55 KiB)
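The benchmarks call a function named yoursolution that is never shown. Presumably it is the loop from the question wrapped in a function; the following is a minimal sketch under that assumption, not the definition that was actually timed:

```julia
# Assumed definition: the question's loop wrapped in a function
# so that it can be passed to @btime.
function yoursolution(valuelist, numberofinstance)
    valuevector = String[]
    for i in 1:length(numberofinstance)
        # repeat([x], n) builds a vector holding x exactly n times
        append!(valuevector, repeat([valuelist[i]], numberofinstance[i]))
    end
    return valuevector
end
```

Calling yoursolution(valuelist, numberofinstance) then returns the same vector that the top-level loop builds in valuevector.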
If you want to avoid the package, you could in principle broadcast repeat or ^ (power), like this:
vcat(collect.(.^(valuelist, numberofinstance))...)
but I think this is harder to parse and also slower than inverse_rle:
julia> @btime yoursolution($valuelist, $numberofinstance);
693.329 ns (13 allocations: 1.55 KiB)
julia> @btime vcat(collect.(.^($valuelist, $numberofinstance))...)
472.615 ns (9 allocations: 800 bytes)
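One detail about the broadcast version: ^ on a string repeats it ("a"^3 is "aaa"), and collect then splits each repeated string into characters, so the result is a vector of Char rather than String and only lines up with the expected output for single-character values. A package-free sketch that keeps the elements as strings might broadcast fill instead. The names chars and strs are just for illustration, and this variant has not been benchmarked here:

```julia
valuelist = ["a", "b", "d", "z"]
numberofinstance = [3, 5, 1, 11]

# The one-liner from the answer: the elements come out as Char
chars = vcat(collect.(.^(valuelist, numberofinstance))...)

# Alternative sketch: fill(v, n) builds an n-element vector of v,
# and reduce(vcat, ...) concatenates the pieces without splatting
strs = reduce(vcat, fill.(valuelist, numberofinstance))
```

Here eltype(chars) is Char while eltype(strs) is String; both have 20 elements.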
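However, since Julia lets you write fast loops, you can easily define your own simple function. The following is much faster than your solution (and as fast as the StatsBase implementation):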
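function multiply(vs, ns)
    # Preallocate the result; sum(ns) is the total output length
    r = Vector{String}(undef, sum(ns))
    c = 1  # next write position in r
    @inbounds for i in axes(ns, 1)
        for k in 1:ns[i]
            r[c] = vs[i]  # write vs[i] exactly ns[i] times
            c += 1
        end
    end
    r
end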
Benchmark:
julia> @btime yoursolution($valuelist, $numberofinstance);
693.329 ns (13 allocations: 1.55 KiB)
julia> @btime multiply($valuelist, $numberofinstance);
76.469 ns (1 allocation: 240 bytes)
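The multiply function above hard-codes Vector{String} and uses @inbounds, so it assumes string values and that vs is at least as long as ns. If other element types are needed, a variant along the following lines might work. The name expand_counts is hypothetical, the length check replaces @inbounds for safety, and the timings above do not apply to it:

```julia
# Hypothetical generic variant of multiply: the element type is taken
# from the input and the lengths are checked up front.
function expand_counts(vs::AbstractVector, ns::AbstractVector{<:Integer})
    length(vs) == length(ns) ||
        throw(ArgumentError("vs and ns must have the same length"))
    r = Vector{eltype(vs)}(undef, sum(ns))
    c = 1  # next write position in r
    for (v, n) in zip(vs, ns)
        for _ in 1:n
            r[c] = v
            c += 1
        end
    end
    return r
end
```

For example, expand_counts([1, 2], [2, 1]) returns [1, 1, 2], and expand_counts(valuelist, numberofinstance) gives the same 20-element vector of strings as multiply.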