How can I generate random samples of bin counts given a sequence of bin probabilities?

Question

Mat*_*att 5 python random numpy vectorization probability-density

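I have an integer that needs to be split into bins according to a probability distribution. For example, if I have N=100 objects going into bins with probabilities [0.02, 0.08, 0.16, 0.29, 0.45], I might get [1, 10, 20, 25, 44].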

import numpy as np
# sample distribution
d = np.array([x ** 2 for x in range(1,6)], dtype=float)
d = d / d.sum()
dcs = d.cumsum()
bins = np.zeros(d.shape)
N = 100
for roll in np.random.rand(N):
    # grab the first index that the roll satisfies
    i = np.where(roll < dcs)[0][0]  
    bins[i] += 1

In practice, both N and my number of bins are very large, so looping is not a viable option. Is there a way to vectorize this operation to speed it up?

Answer 1

ali*_*i_m 5

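You can convert your PDF to a CDF by taking its cumsum, use that to define a set of bin edges between 0 and 1, and then use those edges to compute the histogram of a random uniform vector of length N: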

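import numpy as np

# bin edges are the CDF; the leading 0 is the leftmost bin edge
cdf = np.cumsum([0, 0.02, 0.08, 0.16, 0.29, 0.45])
# histogram N=100 uniform samples over those edges
counts, edges = np.histogram(np.random.rand(100), bins=cdf)

print(counts)
# e.g. [ 4  8 16 30 42]  (random, so the counts vary between runs)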
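
The same inverse-CDF idea can also be written with `np.searchsorted` and `np.bincount`, which mirrors the loop in the question more directly. This is only a sketch: the function name `sample_bin_counts` and its `rng` parameter are made up for illustration, and the clipping step is an added guard for the case where rounding leaves the last CDF value slightly below 1.

```python
import numpy as np

def sample_bin_counts(probs, n, rng=None):
    """Count how many of n uniform draws fall in each bin (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    probs = np.asarray(probs, dtype=float)
    cdf = np.cumsum(probs / probs.sum())   # right edge of each bin
    rolls = rng.random(n)                  # uniform samples in [0, 1)
    # index of the first bin whose right edge is greater than the roll
    idx = np.searchsorted(cdf, rolls, side="right")
    # assumption: clip in case cdf[-1] ends up just under 1.0 from rounding
    idx = np.minimum(idx, len(probs) - 1)
    return np.bincount(idx, minlength=len(probs))

counts = sample_bin_counts([0.02, 0.08, 0.16, 0.29, 0.45], 100,
                           rng=np.random.default_rng(0))  # seed is arbitrary
print(counts, counts.sum())
```

The counts always sum to N, and `minlength` keeps empty bins in the result instead of truncating the array.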

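
As an alternative to the histogram approach above (not the answerer's method), bin counts for N independent draws follow a multinomial distribution, so NumPy's multinomial sampler may give the counts in a single call without materialising N uniform samples. A minimal sketch, reusing the distribution `d` from the question; the seed and the names `bins` and `many` are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability

# same sample distribution as in the question
d = np.array([x ** 2 for x in range(1, 6)], dtype=float)
d = d / d.sum()
N = 100

# one draw: an integer array of per-bin counts that sums to N
bins = rng.multinomial(N, d)
print(bins, bins.sum())

# several independent draws at once: one row per draw
many = rng.multinomial(N, d, size=3)
print(many.shape)
```

Whether this is faster than the histogram approach for a very large number of bins would need to be measured on the actual data.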
