tri*_*ook 2 python arrays performance numpy
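I am interested in getting the location of the minimum value of a 1-D NumPy array among the elements that satisfy a certain condition (in my case, being at or above a threshold). For example: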
import numpy as np
limit = 3
a = np.array([1, 2, 4, 5, 2, 5, 3, 6, 7, 9, 10])
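I would like to efficiently mask all numbers in a that are below limit, so that the result of np.argmin would be 6. Is there a computationally cheap way to mask the values that don't meet a condition and then apply np.argmin?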
This can be done simply with NumPy's MaskedArray:
import numpy as np
limit = 3
a = np.array([1, 2, 4, 5, 2, 5, 3, 6, 7, 9, 10])
b = np.ma.MaskedArray(a, a<limit)
np.ma.argmin(b) # == 6
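For comparison, a similar result can be had without a masked array by replacing the out-of-range values with a sentinel before calling argmin. This is only a minimal sketch: the np.inf sentinel and the names masked_idx and sentinel_idx are my own choices, not part of the answer above, and np.where promotes the integer array to float in this case.

```python
import numpy as np

limit = 3
a = np.array([1, 2, 4, 5, 2, 5, 3, 6, 7, 9, 10])

# Masked-array approach from the answer above.
masked_idx = np.ma.argmin(np.ma.MaskedArray(a, a < limit))

# Sentinel approach (an assumption of this sketch, not part of the answer):
# values below the limit are replaced by +inf so they can never be the minimum.
# np.where returns a float array here because np.inf is a float.
sentinel_idx = np.where(a >= limit, a, np.inf).argmin()

print(masked_idx, sentinel_idx)  # both are 6 for this input
```

Note that if no element reaches the limit, the sentinel version silently returns 0, so it is only a fit when at least one valid element is guaranteed.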
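You could store the valid indices, use them to select the valid elements of a, and then use the argmin() of those selected elements to index back into the valid indices and get the final index. The implementation would look something like this -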
valid_idx = np.where(a >= limit)[0]
out = valid_idx[a[valid_idx].argmin()]
Sample run -
In [32]: limit = 3
...: a = np.array([1, 2, 4, 5, 2, 5, 3, 6, 7, 9, 10])
...:
In [33]: valid_idx = np.where(a >= limit)[0]
In [34]: valid_idx[a[valid_idx].argmin()]
Out[34]: 6
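One case the sample run does not cover is an array in which no element reaches the limit: valid_idx is then empty and argmin() raises a ValueError. A minimal sketch of one way to guard against that is given here; the function name masked_argmin_or_none and the choice to return None are hypothetical, not part of the answer.

```python
import numpy as np

def masked_argmin_or_none(a, limit):
    """Return the index of the smallest element >= limit, or None if none qualifies."""
    valid_idx = np.where(a >= limit)[0]
    if valid_idx.size == 0:
        # argmin() on an empty selection would raise ValueError, so bail out early.
        return None
    return valid_idx[a[valid_idx].argmin()]

a = np.array([1, 2, 4, 5, 2, 5, 3, 6, 7, 9, 10])
print(masked_argmin_or_none(a, 3))    # 6
print(masked_argmin_or_none(a, 100))  # None
```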
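Runtime test -
For performance benchmarking, in this section I compare the other solution, based on a masked array, against the regular-array-based solution proposed earlier in this post, for various data sizes.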
def masked_argmin(a,limit): # Defining func for regular array based soln
    valid_idx = np.where(a >= limit)[0]
    return valid_idx[a[valid_idx].argmin()]
In [52]: # Inputs
...: a = np.random.randint(0,1000,(10000))
...: limit = 500
...:
In [53]: %timeit np.argmin(np.ma.MaskedArray(a, a<limit))
1000 loops, best of 3: 233 µs per loop
In [54]: %timeit masked_argmin(a,limit)
10000 loops, best of 3: 101 µs per loop
In [55]: # Inputs
...: a = np.random.randint(0,1000,(100000))
...: limit = 500
...:
In [56]: %timeit np.argmin(np.ma.MaskedArray(a, a<limit))
1000 loops, best of 3: 1.73 ms per loop
In [57]: %timeit masked_argmin(a,limit)
1000 loops, best of 3: 1.03 ms per loop
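The timings above come from an IPython session, so they depend on the machine and NumPy version used at the time. For anyone wanting to repeat the comparison outside IPython, here is a rough sketch using the standard timeit module. The fixed seed, the repeat counts, and the helper name ma_argmin are my own assumptions; no particular timings are implied.

```python
import timeit

import numpy as np

def masked_argmin(a, limit):
    # Regular-array solution from this post.
    valid_idx = np.where(a >= limit)[0]
    return valid_idx[a[valid_idx].argmin()]

def ma_argmin(a, limit):
    # Masked-array solution from the other answer (helper name is hypothetical).
    return np.ma.argmin(np.ma.MaskedArray(a, a < limit))

rng = np.random.default_rng(0)  # fixed seed, chosen only for repeatability
limit = 500
for n in (10_000, 100_000):
    a = rng.integers(0, 1000, n)
    # Both approaches should agree on the index before we time them.
    assert masked_argmin(a, limit) == ma_argmin(a, limit)
    for func in (ma_argmin, masked_argmin):
        # Best of 3 runs, 100 calls each, reported per call.
        best = min(timeit.repeat(lambda: func(a, limit), number=100, repeat=3))
        print(f"n={n:>7} {func.__name__:<14} {best / 100 * 1e6:8.1f} µs per call")
```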