How do I select a random item from a weighted array in Julia?

Question

Rem*_*i.b 20 arrays random random-sample julia

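Consider two one-dimensional arrays: one contains the items to select from, and the other contains the probability of drawing each item of the first array.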

items = ["a", 2, 5, "h", "hello", 3]
weights = [0.1, 0.1, 0.2, 0.2, 0.1, 0.3]

In Julia, how can I randomly select an item from items, using weights to give the probability of drawing each item?

Answer 1

Iai*_*ing 19

Use the StatsBase.jl package, i.e.

using Pkg
Pkg.add("StatsBase")  # Only needed once per environment
using StatsBase
items = ["a", 2, 5, "h", "hello", 3]
weights = [0.1, 0.1, 0.2, 0.2, 0.1, 0.3]
sample(items, Weights(weights))

Or, if you want to draw many samples:

# With replacement
my_samps = sample(items, Weights(weights), 10)
# Without replacement
my_samps = sample(items, Weights(weights), 2, replace=false)

(Note that older versions of StatsBase, from before Julia 1.0, called this type WeightVec; current versions call it Weights.)

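You can learn more about Weights, and why it exists, in the StatsBase documentation. The sampling algorithms are very efficient and are designed to use different methods depending on the size of the input.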

Answer 2

Mil*_*les 5

Here is a simpler approach that uses only Julia's base library:

sample(items, weights) = items[findfirst(cumsum(weights) .> rand())]

Example:

julia> sample(["a", 2, 5, "h", "hello", 3], [0.1, 0.1, 0.2, 0.2, 0.1, 0.3])
"h"

This is less efficient than StatsBase.jl, but it is fine for small vectors.
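
To see why the one-liner works, it may help to replace rand() with a fixed number and trace the lookup by hand. The sketch below does that with the arrays from the question. pick is a hypothetical helper introduced only for illustration; it is not part of the answer above or of any library.

```julia
# Deterministic walk-through of the cumulative-sum lookup,
# with rand() replaced by a fixed value r in [0, 1).
items = ["a", 2, 5, "h", "hello", 3]
weights = [0.1, 0.1, 0.2, 0.2, 0.1, 0.3]

# Running totals: approximately [0.1, 0.2, 0.4, 0.6, 0.7, 1.0].
# Each item "owns" the interval between two consecutive totals,
# so wider intervals (larger weights) are hit more often.
cdf = cumsum(weights)

# pick (hypothetical helper): the item at the first index whose
# running total is strictly greater than r.
pick(r) = items[findfirst(cdf .> r)]

pick(0.05)  # "a": 0.05 lies below the first total, 0.1
pick(0.45)  # "h": 0.45 lies between the totals 0.4 and 0.6
pick(0.95)  # 3:   0.95 lies between the totals 0.7 and 1.0
```

With a uniformly distributed r, the chance of landing in an item's interval equals that interval's width, which is the item's weight.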

Also, if weights is not a normalized vector, you can do the following:

sample(items, weights) = items[findfirst(cumsum(weights) .> rand() * sum(weights))]
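
A slightly more defensive variant of the same idea might look like the sketch below. The name weighted_pick is hypothetical, chosen so that it does not clash with StatsBase.sample. The sketch assumes non-negative weights with a positive total and makes no claim about how StatsBase implements sampling. The empirical check at the end simply counts how often each item is drawn.

```julia
# weighted_pick is a hypothetical name, not a Base or StatsBase function.
# Assumes: weights are non-negative and their total is positive.
function weighted_pick(items, weights)
    length(items) == length(weights) ||
        throw(ArgumentError("items and weights must have the same length"))
    cdf = cumsum(weights)
    r = rand() * cdf[end]        # scale by the total, so weights need not sum to 1
    i = findfirst(>(r), cdf)     # first running total strictly above r
    # Defensive fallback: use the last index if no running total exceeds r.
    return items[something(i, lastindex(items))]
end

# Empirical check with unnormalized weights in the same proportions
# as the question (1:1:2:2:1:3).
items = ["a", 2, 5, "h", "hello", 3]
weights = [1, 1, 2, 2, 1, 3]
n = 100_000
draws = [weighted_pick(items, weights) for _ in 1:n]

# Observed share of each item; should be close to weights ./ sum(weights).
freq = [count(==(x), draws) / n for x in items]
```

Because each call recomputes the cumulative sum, this costs time proportional to the number of items per draw, which is consistent with the remark that the approach suits small vectors.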

Asked:	11 years ago
Viewed:	2825 times
Active:	6 years, 6 months ago