Numba nopython mode does not accept 2-D boolean indexing

mat*_*guy 4 python indexing numpy python-3.x numba

I am trying to use numba (currently numba 0.45.1) to speed up some code, but I have run into a problem with boolean indexing. The code is as follows:

from numba import njit
import numpy as np

n_max = 1000

n_arr = np.hstack((np.arange(1,3),
                   np.arange(3,n_max, 3)
                   ))

@njit
def func(arr):
    idx =  np.arange(arr[-1]).reshape((-1,1)) < arr -2
    result = np.zeros(idx.shape)
    result[idx] = 10.1
    return result

new_arr = func(n_arr)

As soon as I run the code, I get the following message:

TypingError: Invalid use of Function(<built-in function setitem>) with argument(s) of type(s): (array(float64, 2d, C), array(bool, 2d, C), float64)
 * parameterized
In definition 0:
    All templates rejected with literals.
In definition 1:
    All templates rejected without literals.
In definition 2:
    All templates rejected with literals.
In definition 3:
    All templates rejected without literals.
In definition 4:
    All templates rejected with literals.
In definition 5:
    All templates rejected without literals.
In definition 6:
    All templates rejected with literals.
In definition 7:
    All templates rejected without literals.
In definition 8:
    TypeError: unsupported array index type array(bool, 2d, C) in [array(bool, 2d, C)]
    raised from C:\Users\User\Anaconda3\lib\site-packages\numba\typing\arraydecl.py:71
In definition 9:
    TypeError: unsupported array index type array(bool, 2d, C) in [array(bool, 2d, C)]
    raised from C:\Users\User\Anaconda3\lib\site-packages\numba\typing\arraydecl.py:71
This error is usually caused by passing an argument of a type that is unsupported by the named function.
[1] During: typing of setitem at C:/Users/User/Desktop/all python file/5.5.5/numba index broadcasting2.py (29)

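Note that the (29) at the end of the last line refers to line 29 of my file, which is the line result[idx] = 10.1, where I try to assign a value into result using the 2-D boolean index idx.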


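I should explain that the statement result[idx] = 10.1 has to stay inside the @njit function. I would prefer to move it out of @njit, but I cannot, because this line sits right in the middle of the code I am working on.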

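Given that the assignment result[idx] = 10.1 must stay inside @njit, what exactly needs to change to make it work? If possible, I would like to see some code examples of 2-D boolean indexing that run under @njit.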

Thanks

Jos*_*del 5

Numba does not currently support fancy indexing with 2D arrays. See:

https://numba.pydata.org/numba-doc/dev/reference/numpysupported.html#array-access

However, you can get equivalent behavior by rewriting your function with explicit for loops instead of relying on broadcasting:

from numba import njit
import numpy as np

n_max = 1000

n_arr = np.hstack((np.arange(1,3),
                   np.arange(3,n_max, 3)
                   ))

def func(arr):
    idx =  np.arange(arr[-1]).reshape((-1,1)) < arr -2
    result = np.zeros(idx.shape)
    result[idx] = 10.1
    return result

@njit
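def func2(arr):
    M = arr[-1]          # number of rows: the last (largest) entry of arr
    N = arr.shape[0]     # number of columns: one per entry of arr
    result = np.zeros((M, N))
    for i in range(M):
        for j in range(N):
            # Same condition as the broadcast mask: row index < arr[j] - 2
            if i < arr[j] - 2:
                result[i, j] = 10.1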

    return result

new_arr = func(n_arr)
new_arr2 = func2(n_arr)
print(np.allclose(new_arr, new_arr2))  # True

On my machine, with the example input you provided, func2 runs faster than func.
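If you would rather keep the boolean mask than write the loops, another option is to index through flattened views, since a single 1-D boolean index is supported. The sketch below is an illustration rather than a tested recommendation: func_flat is a hypothetical name, it assumes both arrays are C-contiguous (true for the arrays created here) so that ravel() returns a view instead of a copy, and the try/except fallback is only there so the snippet still runs where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: fall back to plain Python so the sketch still runs without Numba.
    def njit(f):
        return f

n_max = 1000
n_arr = np.hstack((np.arange(1, 3), np.arange(3, n_max, 3)))

def func(arr):
    # Plain NumPy reference that uses 2-D boolean indexing directly.
    idx = np.arange(arr[-1]).reshape((-1, 1)) < arr - 2
    result = np.zeros(idx.shape)
    result[idx] = 10.1
    return result

@njit
def func_flat(arr):  # hypothetical name, not from the original answer
    idx = np.arange(arr[-1]).reshape((-1, 1)) < arr - 2
    result = np.zeros(idx.shape)
    # Both arrays are C-contiguous, so ravel() gives 1-D views of the same data.
    flat_result = result.ravel()
    flat_idx = idx.ravel()
    # 1-D boolean indexing; the write goes through the view into result.
    flat_result[flat_idx] = 10.1
    return result

new_arr3 = func_flat(n_arr)
print(np.allclose(func(n_arr), new_arr3))
```

This keeps the mask-based structure of the original function; the explicit loops in func2 avoid building the temporary mask at all.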

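  • Interesting. Seeing that broadcasting is slower than a for loop inside @njit in this case, I wonder whether that holds for most large-data operations? I ask because, although in this example I am only broadcasting over "numpy" arrays, in the future I will be broadcasting in a "tensorflow" setting, so it would be really great to know the answer. (2 upvotes)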
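The relative speed of the two versions depends on the array sizes and on the operation, so it is worth measuring for the workload at hand. Below is a minimal timing sketch using the standard timeit module. The names func_broadcast and func_loop, the reduced input size, and the repeat counts are all assumptions chosen for illustration, and the try/except fallback only keeps the snippet runnable without Numba (in which case the loop timing says nothing about JIT-compiled code).

```python
import timeit
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: fallback so the sketch runs without Numba (timings are then not meaningful).
    def njit(f):
        return f

def func_broadcast(arr):
    # Plain NumPy version: broadcast comparison plus 2-D boolean indexing.
    idx = np.arange(arr[-1]).reshape((-1, 1)) < arr - 2
    result = np.zeros(idx.shape)
    result[idx] = 10.1
    return result

@njit
def func_loop(arr):
    # Explicit-loop version, same logic as func2 in the answer.
    M = arr[-1]
    N = arr.shape[0]
    result = np.zeros((M, N))
    for i in range(M):
        for j in range(N):
            if i < arr[j] - 2:
                result[i, j] = 10.1
    return result

# Smaller input than in the question, purely to keep the sketch quick.
n_arr = np.hstack((np.arange(1, 3), np.arange(3, 300, 3)))

# Warm-up call so that JIT compilation time is not included in the measurement.
func_loop(n_arr)

# Take the best of a few repeats to reduce noise from other processes.
t_broadcast = min(timeit.repeat(lambda: func_broadcast(n_arr), number=5, repeat=3))
t_loop = min(timeit.repeat(lambda: func_loop(n_arr), number=5, repeat=3))

print("broadcast:", t_broadcast, "s")
print("loop:     ", t_loop, "s")
```

Repeating the measurement for several values of the input size shows how the comparison changes as the arrays grow.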