numpy.argmin for elements greater than a threshold

tri*_*ook 2 python arrays performance numpy

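I'm interested in getting the position of the minimum value in a 1-D NumPy array among the elements that meet a certain condition (in my case, a medium threshold). For example: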

import numpy as np

limit = 3
a = np.array([1, 2, 4, 5, 2, 5, 3, 6, 7, 9, 10])

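I'd like to effectively mask all numbers in a that are below the limit, so that the result of np.argmin would be 6. Is there a computationally cheap way to mask the values that don't meet the condition and then apply np.argmin?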

Max*_*ers 8

This can be done simply with numpy's MaskedArray:

import numpy as np

limit = 3
a = np.array([1, 2, 4, 5, 2, 5, 3, 6, 7, 9, 10])
b = np.ma.MaskedArray(a, a<limit)
np.ma.argmin(b)    # == 6

  • Why not `b.argmin()`? (2 upvotes)
  • Another possibility: `b = np.ma.masked_less(a, limit)`. (2 upvotes)
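As a quick check of the two comments above, here is a minimal sketch that runs all three spellings on the same `a` and `limit` from the question. The variable names `b1`, `b2`, and `idx_*` are only for illustration.

```python
import numpy as np

limit = 3
a = np.array([1, 2, 4, 5, 2, 5, 3, 6, 7, 9, 10])

# Explicit boolean mask: True marks the entries to ignore.
b1 = np.ma.MaskedArray(a, a < limit)
# Convenience constructor that masks everything strictly below `limit`.
b2 = np.ma.masked_less(a, limit)

idx_func = int(np.ma.argmin(b1))    # module-level function
idx_method = int(b1.argmin())       # method on the masked array
idx_masked_less = int(b2.argmin())  # same mask built with masked_less

print(idx_func, idx_method, idx_masked_less)
```

On this input all three should agree on index 6, since the masked entries (1, 2, 2) are excluded from the search.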

Div*_*kar 7

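You could store the indices of the valid elements, use them to select those elements from a, and then use the argmin() of the selected elements to index back into the stored indices and get the final index. The implementation would look something like this -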

valid_idx = np.where(a >= limit)[0]
out = valid_idx[a[valid_idx].argmin()]

Sample run -

In [32]: limit = 3
    ...: a = np.array([1, 2, 4, 5, 2, 5, 3, 6, 7, 9, 10])
    ...: 

In [33]: valid_idx = np.where(a >= limit)[0]

In [34]: valid_idx[a[valid_idx].argmin()]
Out[34]: 6
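One case the snippet above does not cover is when no element reaches the limit: the selection is then empty and `argmin()` raises a `ValueError`. A minimal sketch of a guarded variant follows; the function name `argmin_at_least` and the choice to return `None` are assumptions for illustration, not part of the original answer.

```python
import numpy as np

def argmin_at_least(a, limit):
    """Index of the smallest element of `a` that is >= limit, or None if none qualifies."""
    # flatnonzero gives the same indices as np.where(...)[0] for a 1-D array.
    valid_idx = np.flatnonzero(a >= limit)
    if valid_idx.size == 0:
        # Nothing meets the condition; argmin on an empty selection would raise.
        return None
    # argmin over the selected values, mapped back to a position in `a`.
    return int(valid_idx[a[valid_idx].argmin()])

a = np.array([1, 2, 4, 5, 2, 5, 3, 6, 7, 9, 10])
print(argmin_at_least(a, 3))    # smallest value >= 3 is the 3 at index 6
print(argmin_at_least(a, 100))  # no element qualifies
```

Whether to return `None`, `-1`, or raise a custom error is a design choice that depends on the caller.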

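Runtime test -

For performance benchmarking, this section compares the other solution based on a masked array against the regular array based solution proposed earlier in this post, for various data sizes.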

def masked_argmin(a,limit): # Defining func for regular array based soln
    valid_idx = np.where(a >= limit)[0]
    return valid_idx[a[valid_idx].argmin()]

In [52]: # Inputs
    ...: a = np.random.randint(0,1000,(10000))
    ...: limit = 500
    ...: 

In [53]: %timeit np.argmin(np.ma.MaskedArray(a, a<limit))
1000 loops, best of 3: 233 µs per loop

In [54]: %timeit masked_argmin(a,limit)
10000 loops, best of 3: 101 µs per loop

In [55]: # Inputs
    ...: a = np.random.randint(0,1000,(100000))
    ...: limit = 500
    ...: 

In [56]: %timeit np.argmin(np.ma.MaskedArray(a, a<limit))
1000 loops, best of 3: 1.73 ms per loop

In [57]: %timeit masked_argmin(a,limit)
1000 loops, best of 3: 1.03 ms per loop
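The timings above came from an IPython session. To rerun the comparison as a plain script, something like the following sketch could be used. The function names `masked_array_argmin` and `index_argmin`, the seed, and the repeat counts are my own choices for illustration, and the absolute numbers will differ by machine and NumPy version.

```python
import timeit

import numpy as np

def masked_array_argmin(a, limit):
    # Masked-array approach: entries below `limit` are hidden from argmin.
    return int(np.ma.MaskedArray(a, a < limit).argmin())

def index_argmin(a, limit):
    # Index-based approach: argmin over the valid subset, mapped back to `a`.
    valid_idx = np.where(a >= limit)[0]
    return int(valid_idx[a[valid_idx].argmin()])

rng = np.random.default_rng(0)  # fixed seed so the inputs are reproducible
limit = 500
number = 100                    # calls per timing run

for n in (10_000, 100_000):
    a = rng.integers(0, 1000, n)
    # Both approaches should pick the same index before we compare speed.
    assert masked_array_argmin(a, limit) == index_argmin(a, limit)
    for func in (masked_array_argmin, index_argmin):
        best = min(timeit.repeat(lambda: func(a, limit), number=number, repeat=3))
        print(f"n={n:>7}  {func.__name__:<20} {best / number * 1e6:8.1f} µs per call")
```

Taking the best of several repeats mirrors what `%timeit` reports and reduces noise from other processes.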