How do I compute the mean of each bin after splitting a NumPy array with `numpy.digitize`?

Question


fin*_*oot 3 python arrays numpy scientific-computing scipy

I have an input array that is split into bins, and I want to compute the mean of each bin. Consider the following example:

>>> import numpy as np
>>> a = np.array([1.4, 2.6, 0.7, 1.1])


The array is split into bins with `np.digitize`:

>>> bins = np.arange(0, 2 + 1)
>>> indices = np.digitize(a, bins)
>>> indices
array([2, 3, 1, 2])


This does exactly what I expect, as you can see more explicitly here:

>>> for i in range(len(bins)):
...     f"bin where {i} <= x < {i + 1} contains {a[indices == i + 1]}"
... 
'bin where 0 <= x < 1 contains [0.7]'
'bin where 1 <= x < 2 contains [1.4 1.1]'
'bin where 2 <= x < 3 contains [2.6]'


However, I now want to get the mean of each bin. Doing this the non-NumPy way with a `for` loop would look like this:

>>> b = np.array([a[indices == i + 1].mean() for i in range(len(bins))])
>>> b
array([0.7 , 1.25, 2.6 ])


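But using a `for` loop for this seems neither elegant (Pythonic) nor efficient, since the resulting list then has to be converted to a NumPy array with `np.array`.

What is the NumPy way to do this?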

Answer 1

Qua*_*ang 5

IIUC, this is a job for `bincount`:

np.bincount(indices-1,a)/np.bincount(indices-1)


Output:

array([0.7, 1.25, 2.6])

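For reference, here is a slightly more defensive sketch of the same `bincount` idea. The first `bincount` call sums the values per bin (via `weights`), the second counts them, and their ratio is the per-bin mean. The function name `bin_means` is made up for illustration, and the sketch assumes every value is at least `bins[0]`, so that `np.digitize` never returns 0. It uses `minlength` so that trailing empty bins still appear in the result, and reports empty bins as NaN instead of dividing by zero:

```python
import numpy as np

def bin_means(values, bins):
    """Per-bin mean of `values` (hypothetical helper, not a NumPy function).

    Assumes all values are >= bins[0], so the 0-based labels are never negative.
    """
    labels = np.digitize(values, bins) - 1   # shift to 0-based bin labels
    n_bins = len(bins)                       # last bin holds values >= bins[-1]

    # Per-bin sum of the values and per-bin number of values.
    sums = np.bincount(labels, weights=values, minlength=n_bins)
    counts = np.bincount(labels, minlength=n_bins)

    # Divide only where a bin is non-empty; empty bins stay NaN.
    means = np.full(n_bins, np.nan)
    np.divide(sums, counts, out=means, where=counts > 0)
    return means

a = np.array([1.4, 2.6, 0.7, 1.1])
bins = np.arange(0, 2 + 1)
print(bin_means(a, bins))
```

For the example in the question this gives the same three means as the one-liner above; the only difference in behaviour is for bins that receive no values.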
