Python numpy.random.choice: ValueError: Fewer non-zero entries in p than size

Ran*_*ani 7 python numpy

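I want to randomly select sample points according to the probability distribution given by each row of prob. However, when I call np.random.choice, I get the error ValueError: Fewer non-zero entries in p than size. What does size even mean? I also looked at the implementation, but I do not understand it. Thanks for your help!!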

import numpy as np

# prob is a numpy array of shape (14, 6890)
all_zero = np.where(prob.max(1) < 1e-6)[0] # find indices of rows where all values are smaller than 1e-6
prob[all_zero] = 1 / prob.shape[1] # fill those rows uniformly
prob /= prob.sum(axis=1, keepdims=True)
# ... somewhere later inside a method
for j in range(14):
    sample = np.random.choice(6890, 4, replace=False, p=prob[j]) # error occurs here

ave*_*ler 5

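The problem is that you are asking np.random.choice to select 4 entries without reusing values (replace=False) from an array of 6890 entries in which fewer than 4 have a non-zero probability. Here size is the number of samples to draw (the second argument). For example: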

>>> np.random.choice(5, 1, replace=False, p=[0, 0, 0, 0.6, 0.4])
array([4])

>>> np.random.choice(5, 4, replace=False, p=[0, 0, 0, 0.6, 0.4])
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    np.random.choice(5, 4, replace=False, p=[0, 0, 0, 0.6, 0.4])
  File "mtrand.pyx", line 826, in numpy.random.mtrand.RandomState.choice
ValueError: Fewer non-zero entries in p than size

>>> np.random.choice(5, 4, replace=True, p=[0, 0, 0, 0.6, 0.4])
array([3, 3, 4, 3])
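The condition behind the error can be checked before sampling. The following is only a sketch: the helper name enough_nonzero is made up for illustration and is not part of NumPy, and capping the sample count at the number of non-zero entries is one possible workaround, not necessarily what the question needs.

```python
import numpy as np

def enough_nonzero(p, size):
    """Hypothetical helper: True if p has at least `size` non-zero entries."""
    return np.count_nonzero(p) >= size

p = [0, 0, 0, 0.6, 0.4]

print(enough_nonzero(p, 1))  # drawing 1 value without replacement is possible
print(enough_nonzero(p, 4))  # drawing 4 distinct values is not

# Possible workaround (an assumption, not from the answer above):
# draw only as many distinct values as p can actually supply.
k = min(4, np.count_nonzero(p))
sample = np.random.choice(5, k, replace=False, p=p)
print(sample)
```

With this p, only indices 3 and 4 can ever be drawn, so at most two distinct values are available without replacement.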

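So the solution depends on what you need: either make sure p has at least size non-zero entries, or enable replacement in the random choice (replace=True).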
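Applied to a setup like the one in the question, both options might look roughly like this. The small prob array below is a made-up stand-in for the real (14, 6890) array, and the eps value used for smoothing is an arbitrary assumption. Note that smoothing changes the distribution slightly, since every index gets a tiny non-zero chance of being picked.

```python
import numpy as np

# Hypothetical stand-in for the question's `prob` array; each row sums to 1.
prob = np.array([
    [0.0, 0.0, 0.0, 0.6, 0.4, 0.0, 0.0, 0.0],    # 2 non-zero entries
    [0.1, 0.2, 0.3, 0.1, 0.1, 0.1, 0.05, 0.05],  # 8 non-zero entries
    [0.0, 0.5, 0.0, 0.25, 0.0, 0.25, 0.0, 0.0],  # 3 non-zero entries
])
size = 4

# Diagnose: which rows cannot supply `size` distinct samples?
nonzero_per_row = np.count_nonzero(prob, axis=1)
too_sparse = np.where(nonzero_per_row < size)[0]
print("rows with too few non-zero entries:", too_sparse)

# Option 1: add a tiny constant so every entry is non-zero, then renormalize.
eps = 1e-9  # assumed value; pick what suits your data
smoothed = prob + eps
smoothed /= smoothed.sum(axis=1, keepdims=True)
samples = [
    np.random.choice(prob.shape[1], size, replace=False, p=smoothed[j])
    for j in range(prob.shape[0])
]

# Option 2: keep prob unchanged and allow repeated indices.
with_repl = np.random.choice(prob.shape[1], size, replace=True, p=prob[0])
```

Option 1 keeps the samples distinct within each row, while Option 2 stays faithful to the original probabilities but may return the same index several times.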

For reference, the documentation for numpy.random.choice: