I have a function with the following structure:
import numba
import numpy as np

@numba.jit(nopython=True)
def foo(X, N):
    '''
    :param X: 1D numpy array
    :param N: Integer
    :rtype: 2D numpy array of shape len(X) x N
    '''
    out = np.ones((len(X), N))
    out[:, 0] = X
    for i in range(1, N):
        out[:, i] = X**i + out[:, i-1]
    return out
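I am now trying to run this on my GPU. So far, I have tried writing the function in non-vectorized form (i.e., treating each entry of X separately) and passing the return array as an input: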
def foo_cuda(x, N, out):
    '''
    :param x: Scalar
    :param N: Integer
    :param out: 1D numpy array of length N, filled in place
    '''
    out[0] = x
    for i in range(1, N):
        out[i] = x**i + out[i-1]
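Before choosing a decorator, it may be worth confirming on the CPU that the per-element form really reproduces the vectorized one. The following is a minimal sketch in plain NumPy, without Numba or a GPU; `foo_reference` and `foo_scalar` are hypothetical names for undecorated copies of the two functions above.

```python
import numpy as np

def foo_reference(X, N):
    # Same recurrence as foo, without the numba decorator.
    out = np.ones((len(X), N))
    out[:, 0] = X
    for i in range(1, N):
        out[:, i] = X**i + out[:, i - 1]
    return out

def foo_scalar(x, N, out):
    # Same body as foo_cuda: handles one entry of X and fills `out` in place.
    out[0] = x
    for i in range(1, N):
        out[i] = x**i + out[i - 1]

X = np.array([1.0, 2.0, 3.0])
N = 4

expected = foo_reference(X, N)

# Apply the scalar version to each entry of X; rowwise[k] is a writable view.
rowwise = np.empty((len(X), N))
for k in range(len(X)):
    foo_scalar(X[k], N, rowwise[k])

same = np.allclose(expected, rowwise)
print(same)
```

If the two results agree, any remaining problem lies in the decorator and its signature rather than in the rewritten loop.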
However, I don't know which decorator to use for this function. If I use
@numba.vectorize([(float64,int64,float64[:])],target = 'cuda')
I get TypeError: Buffer dtype cannot be buffer
@numba.guvectorize([(float64,int64,float64[:])],'(),()->(n)',target …