Speeding up a for loop of convolutions over a numpy 3D array?

use*_*612 8 python arrays for-loop numpy convolution

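I am performing a convolution along the Z vector of a 3D numpy array and then doing other operations on the result, but the current implementation is slow. Is the for loop what is slowing me down, or is it the convolution? I tried reshaping to a 1D vector and performing the convolution in one pass (as I did in Matlab), without the for loop, but it did not improve performance. My Matlab version is about 50% faster than anything I have come up with in Python. The relevant part of the code: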

convolved=np.zeros((y_lines,x_lines,z_depth))
for i in range(0, y_lines):
    for j in range(0, x_lines):
        convolved[i,j,:]= fftconvolve(data[i,j,:], Gauss) #80% of time here
        result[i,j,:]= other_calculations(convolved[i,j,:]) #20% of time here

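Is there a better way to do this than a for loop? I have heard of Cython, but my Python experience is currently limited, so I am aiming for the simplest solution.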

Cur*_* F. 5

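The fftconvolve function you are using is presumably from SciPy. If so, note that it accepts N-dimensional arrays. A faster way to do your convolution is therefore to build a 3D kernel that does nothing in the x and y dimensions and performs a 1D Gaussian convolution in z.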

Below is some code with timing results. On my machine, with some toy data, this gave a roughly 10x speedup, as you can see:

import numpy as np
from scipy.signal import fftconvolve
from scipy.ndimage import gaussian_filter  # scipy.ndimage.filters is a deprecated namespace

# filter a unit impulse so that the output is the (truncated) 1d gaussian kernel itself
kernel_base = np.zeros(5)
kernel_base[2] = 1.0  # impulse at the centre sample
kernel_1d = gaussian_filter(kernel_base, sigma=1, mode='constant')
kernel_1d = kernel_1d / np.sum(kernel_1d)

# make the 3d kernel that does gaussian convolution in z axis only
kernel_3d = np.zeros(shape=(1, 1, 5,))
kernel_3d[0, 0, :] = kernel_1d

# generate random data
data = np.random.random(size=(50, 50, 50))

# define a function for loop based convolution for easy timeit invocation
def convolve_with_loops(data):
    nx, ny, nz = data.shape
    convolved=np.zeros((nx, ny, nz))
    for i in range(0, nx):
        for j in range(0, ny):
            convolved[i,j,:]= fftconvolve(data[i, j, :], kernel_1d, mode='same') 
    return convolved

# compute the convolution two diff. ways: with loops (first) or as a 3d convolution (2nd)
convolved = convolve_with_loops(data)
convolved_2 = fftconvolve(data, kernel_3d, mode='same')

# raise an error unless the two computations return equivalent results
assert np.all(np.isclose(convolved, convolved_2))

# time the two routes of the computation
%timeit convolved = convolve_with_loops(data)
%timeit convolved_2 = fftconvolve(data, kernel_3d, mode='same')

timeit results:

10 loops, best of 3: 198 ms per loop
100 loops, best of 3: 18.1 ms per loop
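To illustrate why the loop-free version can be faster, here is a minimal NumPy-only sketch of an FFT convolution applied to every z-line at once. It is not how SciPy implements fftconvolve; the helper name fft_convolve_z and the toy shapes are assumptions made for illustration, and the centre crop matches mode='same' only for odd-length kernels.

```python
import numpy as np

def fft_convolve_z(data, kernel_1d):
    """Convolve every z-line of a 3D array with a 1D kernel in one FFT pass.

    Hypothetical helper for illustration; zero-padded, cropped like mode='same'.
    """
    nz = data.shape[-1]
    nk = kernel_1d.shape[0]
    n_full = nz + nk - 1  # length of the full linear convolution
    # rfft along the last axis transforms all (x, y) lines in a single call;
    # the 1D kernel spectrum broadcasts across the first two axes
    spectrum = np.fft.rfft(data, n=n_full, axis=-1) * np.fft.rfft(kernel_1d, n=n_full)
    full = np.fft.irfft(spectrum, n=n_full, axis=-1)
    start = (nk - 1) // 2  # centre crop back to the original z length
    return full[..., start:start + nz]

# toy data and a normalized 5-tap gaussian kernel (assumed values)
rng = np.random.default_rng(0)
data = rng.random((4, 5, 32))
kernel_1d = np.exp(-0.5 * np.arange(-2, 3) ** 2)
kernel_1d /= kernel_1d.sum()

convolved = fft_convolve_z(data, kernel_1d)
```

The point of the sketch is that the per-line Python overhead disappears: one forward transform, one multiplication, and one inverse transform cover the whole volume, and no work is spent on the x and y axes.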