Inq*_*rer 7 python random numpy
I am trying to sample 1000 numbers between 0 and 999, with a vector of weights determining the probability that a particular number is chosen:
import numpy as np
resampled_indices = np.random.choice(a = 1000, size = 1000, replace = True, p = weights)
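Unfortunately, this process has to run thousands of times inside a larger for loop, and np.random.choice appears to be the main speed bottleneck. I would therefore like to know whether there is a way to speed up np.random.choice, or an alternative method that gives the same results.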
It seems you can go slightly faster by sampling uniformly and then "inverting" the cumulative distribution with np.searchsorted:
import numpy as np

# assume arbitrary probabilities
weights = np.random.randn(1000)**2
weights /= weights.sum()

def weighted_random(w, n):
    # cumulative distribution over the indices 0..len(w)-1
    cumsum = np.cumsum(w)
    # n uniform draws in [0, 1)
    rdm_unif = np.random.rand(n)
    # for each draw, find the first index whose cumulative weight reaches it
    return np.searchsorted(cumsum, rdm_unif)

# first method (%timeit is an IPython magic)
%timeit np.random.choice(a = 1000, size = 1000, replace = True, p = weights)
# 10000 loops, best of 3: 220 µs per loop

# proposed method
%timeit weighted_random(weights, 1000)
# 10000 loops, best of 3: 158 µs per loop

The probabilities can then be checked empirically.
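One possible way to run such an empirical check is sketched below. It is not the code from the original answer. The sample size n_draws, the fixed seed, and the clipping guard are assumptions added for illustration. The idea is to draw many samples, count how often each index appears, and compare the observed frequencies with weights.

```python
import numpy as np

def weighted_random(w, n):
    # Inverse-CDF sampling, same approach as in the answer
    cumsum = np.cumsum(w)
    rdm_unif = np.random.rand(n)
    return np.searchsorted(cumsum, rdm_unif)

np.random.seed(0)  # assumption: fixed seed, only to make the check repeatable

# arbitrary probabilities, built the same way as in the answer
weights = np.random.randn(1000)**2
weights /= weights.sum()

n_draws = 1_000_000  # assumption: large enough for the frequencies to settle
samples = weighted_random(weights, n_draws)

# Guard (assumption, not in the answer): rounding in cumsum can leave
# cumsum[-1] slightly below 1, in which case searchsorted may return len(w).
samples = np.minimum(samples, len(weights) - 1)

# Observed frequency of each index; minlength keeps the array aligned
# with weights even if the largest indices are never drawn.
counts = np.bincount(samples, minlength=len(weights))
empirical = counts / n_draws

# Largest deviation between observed frequency and target probability
max_abs_error = np.abs(empirical - weights).max()
print(max_abs_error)
```

With a million draws the largest deviation should be small compared with the weights themselves, and it should shrink further as n_draws grows.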