Find the distance to the nearest zero in a NumPy array

Question

Say I have a NumPy array:

import numpy as np
x = np.array([0, 1, 2, 0, 4, 5, 6, 7, 0, 0])

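At each index, I want to find the distance to the nearest zero value. If the position itself is a zero, the distance is zero. Beyond that, we are only interested in the distance to the nearest zero to the right of the current position. The super naive approach would be something like this: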

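n = x.shape[0]
out = np.full(n, n - 1)
for i in range(n):
    # Walk right from i until a zero (or the end of the array) is reached
    j = 0
    while i + j < n:
        if x[i + j] == 0:
            break
        j += 1
    out[i] = j  # steps taken; n - i if there is no zero to the right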

The output would be:

array([0, 2, 1, 0, 4, 3, 2, 1, 0, 0])

I notice a countdown/decrement pattern in the output between the zeros. So I might be able to use the locations of the zeros (i.e., zero_indices = np.argwhere(x == 0).flatten()).

What is the fastest way to get the desired output in linear time?
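
The countdown pattern mentioned in the question can be made concrete with a single right-to-left pass that remembers the most recent zero. This is a minimal sketch rather than a vectorized solution: the function name `distance_to_next_zero_scan` is made up for illustration, and it assumes the same convention as the naive loop, where a position with no zero to its right gets the number of steps to the end of the array.

```python
import numpy as np

def distance_to_next_zero_scan(x):
    """Illustrative helper (hypothetical name): one pass from right to left."""
    n = len(x)
    out = np.empty(n, dtype=int)
    next_zero = n  # pretend there is a zero just past the end of the array
    for i in range(n - 1, -1, -1):
        if x[i] == 0:
            next_zero = i  # a new zero restarts the countdown
        out[i] = next_zero - i
    return out

x = np.array([0, 1, 2, 0, 4, 5, 6, 7, 0, 0])
print(distance_to_next_zero_scan(x))  # [0 2 1 0 4 3 2 1 0 0]
```

Each element is visited once, so the pass is linear in the array length, but the Python-level loop is likely to be slow on large arrays compared with a vectorized version.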

Answer 1

Div*_*kar 10

Approach #1: Searchsorted to the rescue for linear time in a vectorized manner (before the numba guys come in)!

mask_z = x == 0
idx_z = np.flatnonzero(mask_z)    # indices of the zeros
idx_nz = np.flatnonzero(~mask_z)  # indices of the non-zeros

# Cover for the case when there's no 0 left to the right
# (for same results as with posted loop-based solution)
if x[-1] != 0:
    idx_z = np.r_[idx_z, len(x)]

out = np.zeros(len(x), dtype=int)
# For each non-zero index, find its slot in the sorted zero indices,
# i.e. the first zero to its right
idx = np.searchsorted(idx_z, idx_nz)
out[~mask_z] = idx_z[idx] - idx_nz
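
To see what the searchsorted step is doing, it may help to look at the intermediate arrays for the sample input. This is only an illustration of the approach above: the names `slot` and `next_zero` are not part of the original answer, and because the sample array ends in a zero the padding branch is left out.

```python
import numpy as np

x = np.array([0, 1, 2, 0, 4, 5, 6, 7, 0, 0])

idx_z = np.flatnonzero(x == 0)    # positions of zeros:     [0 3 8 9]
idx_nz = np.flatnonzero(x != 0)   # positions of non-zeros: [1 2 4 5 6 7]

# For each non-zero position, the slot in idx_z holding the first zero to its right
slot = np.searchsorted(idx_z, idx_nz)   # [1 1 2 2 2 2]
next_zero = idx_z[slot]                 # [3 3 8 8 8 8]
print(next_zero - idx_nz)               # [2 1 4 3 2 1]
```

The printed values are the distances for the non-zero positions only; the zero positions keep the 0 they were initialized with.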

Approach #2: Another one, with some cumsum:

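mask_z = x == 0
idx_z = np.flatnonzero(mask_z)

# Cover for the case when there's no 0 left to the right
if x[-1] != 0:
    idx_z = np.r_[idx_z, len(x)]

# The shifted mask's cumsum counts the zeros strictly to the left of each
# position, which is the slot in idx_z of the next zero at or after it
out = idx_z[np.r_[False, mask_z[:-1]].cumsum()] - np.arange(len(x))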
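
The one-liner above packs several steps together, so here is a sketch that splits it up on the sample input. The names `shifted`, `group` and `next_zero` are introduced purely for illustration, and the padding branch is skipped because this particular array ends in a zero.

```python
import numpy as np

x = np.array([0, 1, 2, 0, 4, 5, 6, 7, 0, 0])
mask_z = x == 0
idx_z = np.flatnonzero(mask_z)          # [0 3 8 9]

# Shift the zero mask right by one, so the counter advances just after each zero
shifted = np.r_[False, mask_z[:-1]]
group = shifted.cumsum()                # [0 1 1 1 2 2 2 2 2 3]
next_zero = idx_z[group]                # [0 3 3 3 8 8 8 8 8 9]
print(next_zero - np.arange(len(x)))    # [0 2 1 0 4 3 2 1 0 0]
```

Unlike the searchsorted version, this handles zero and non-zero positions in the same expression, since a zero's next zero is itself.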

Alternatively, the last cumsum step could be replaced with repeat:

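# idx_z as built in Approach #2, including the len(x) padding when x[-1] != 0
r = np.r_[idx_z[0] + 1, np.diff(idx_z)]  # number of positions covered by each zero
out = np.repeat(idx_z, r)[:len(x)] - np.arange(len(x))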
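
As a rough check of the repeat variant, the sketch below runs it on the sample input and shows the repeat counts. It assumes the array ends in a zero (as the sample does), so `idx_z` needs no padding; `next_zero` is an illustrative name.

```python
import numpy as np

x = np.array([0, 1, 2, 0, 4, 5, 6, 7, 0, 0])
idx_z = np.flatnonzero(x == 0)               # [0 3 8 9]

# How many positions have each zero as their next zero
r = np.r_[idx_z[0] + 1, np.diff(idx_z)]      # [1 3 5 1]
next_zero = np.repeat(idx_z, r)[:len(x)]     # [0 3 3 3 8 8 8 8 8 9]
print(next_zero - np.arange(len(x)))         # [0 2 1 0 4 3 2 1 0 0]
```

The repeated array is the same `next_zero` lookup that the cumsum indexing produces, just built from run lengths instead of a running count.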

Approach #3: Another one, with mostly just cumsum:

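mask_z = x == 0
idx_z = np.flatnonzero(mask_z)

# Assumes x contains a zero, and at least two zeros if x[0] == 0
# (idx_z[0] and idx_z[1] are read below)
pp = np.full(len(x), -1)             # count down by 1 at every step ...
pp[idx_z[:-1]] = np.diff(idx_z) - 1  # ... and jump back up at each zero
if idx_z[0] == 0:
    pp[0] = idx_z[1]
else:
    pp[0] = idx_z[0]
out = pp.cumsum()

# Handle the boundary case and assign 0s at the original zero positions
out[idx_z[-1]:] = np.arange(len(x) - idx_z[-1], 0, -1)
out[mask_z] = 0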
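
The step array `pp` is the least obvious part of this approach, so the sketch below prints it and its running sum for the sample input. It follows the code above, with the if/else written as a conditional expression, and assumes the input has at least two zeros.

```python
import numpy as np

x = np.array([0, 1, 2, 0, 4, 5, 6, 7, 0, 0])
mask_z = x == 0
idx_z = np.flatnonzero(mask_z)          # [0 3 8 9]

pp = np.full(len(x), -1)                # every step counts down by 1 ...
pp[idx_z[:-1]] = np.diff(idx_z) - 1     # ... except at a zero, where it jumps back up
pp[0] = idx_z[1] if idx_z[0] == 0 else idx_z[0]
print(pp.tolist())                      # [3, -1, -1, 4, -1, -1, -1, -1, 0, -1]

out = pp.cumsum()
print(out.tolist())                     # [3, 2, 1, 5, 4, 3, 2, 1, 1, 0]

# Fix up the tail after the last zero, then zero out the original zero positions
out[idx_z[-1]:] = np.arange(len(x) - idx_z[-1], 0, -1)
out[mask_z] = 0
print(out.tolist())                     # [0, 2, 1, 0, 4, 3, 2, 1, 0, 0]
```

Before the final masking, the running sum is already correct at the non-zero positions; the values at the zero positions are placeholders that get overwritten with 0.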
