Posts by Pau*_*Pau

Faster NumPy isin alternative for strings using numba

I'm trying to implement a faster version of np.isin in numba. This is what I have so far:

import numpy as np
import numba as nb

@nb.njit(parallel=True)
def isin(a, b):
    # One boolean result per element of a
    out = np.empty(a.shape[0], dtype=nb.boolean)
    # Build a hash set from b so each membership test is a set lookup
    b = set(b)
    for i in nb.prange(a.shape[0]):
        out[i] = a[i] in b
    return out

It works for numbers, as the following example shows:

a = np.array([1,2,3,4])
b = np.array([2,4])

isin(a,b)
>>> array([False,  True, False,  True])
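As a sanity check, the same set-lookup idea can be sketched in plain Python, without numba, and compared against np.isin. This is only a reference sketch: the name `isin_set` and the sample string arrays are hypothetical and not part of my numba code, and it says nothing about how numba itself handles strings.

```python
import numpy as np

def isin_set(a, b):
    """Hypothetical pure-Python reference: set lookup per element, no numba."""
    lookup = set(b.tolist())                    # hash set of the values to search for
    out = np.empty(a.shape[0], dtype=np.bool_)  # one boolean per element of a
    for i, value in enumerate(a.tolist()):
        out[i] = value in lookup                # hash-based membership test
    return out

# Same numeric input as in the example above
a = np.array([1, 2, 3, 4])
b = np.array([2, 4])
result = isin_set(a, b)
assert np.array_equal(result, np.isin(a, b))    # cross-check against NumPy

# Python strings are hashable, so the plain-Python version accepts them too
words = np.array(["apple", "pear", "fig"])
wanted = np.array(["fig", "apple"])
assert np.array_equal(isin_set(words, wanted), np.isin(words, wanted))
```

Because it loops in the interpreter, this version is only useful for checking results, not as a fast replacement.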

And it is faster than np.isin:

a = np.random.rand(20000)
b = np.random.rand(5000)

%time isin(a,b)
CPU times: user 3.96 ms, sys: 0 ns, total: 3.96 ms
Wall time: 1.05 ms

%time np.isin(a,b)
CPU times: user 11 ms, …
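A single %time run can be noisy, and the first call of an njit function also includes compilation time. A minimal sketch of a more repeatable comparison follows, assuming a hypothetical helper `best_time` (not part of NumPy or numba) that does one warm-up call and then keeps the best of several runs. It is shown here with np.isin only; the numba `isin` could be passed in the same way.

```python
import time
import numpy as np

def best_time(fn, *args, repeat=5):
    """Hypothetical helper: best wall-clock time of fn(*args), in seconds."""
    fn(*args)  # warm-up call; for an njit function this is where compilation happens
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return min(timings)

# Same array sizes as in the timing above, with a fixed seed for repeatability
rng = np.random.default_rng(0)
a = rng.random(20000)
b = rng.random(5000)

t_numpy = best_time(np.isin, a, b)
print(f"np.isin: {t_numpy * 1e3:.3f} ms")
# best_time(isin, a, b) would time the numba version under the same conditions
```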

python string performance numpy numba

5 votes, 1 answer, 1443 views
