NumPy cross-correlation - vectorizing

Sim*_*mon 6 python arrays numpy correlation cross-correlation

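I have a large number of cross-correlations to compute, and I'm looking for the fastest way to do it. I'm assuming that vectorizing the problem would help, rather than doing it with loops.

I have a 3D array laid out as electrode x time point x trial (shape: 64x256x913). I want to compute the maximum cross-correlation over the time points for every pair of electrodes, for every trial.

Specifically: for every trial, I want to take each pair of electrodes and compute the maximum cross-correlation value for that pair. That gives 4096 (64*64) maximum cross-correlation values in a single row/vector. This is done for every trial, stacking the rows/vectors on top of each other, which results in a final 2D array of shape 913*4096 containing the maximum cross-correlation values.

That's a lot of computation, but I want to find the fastest way to do it. I mocked up some prototype code using lists as containers, which may help explain the problem better. There may be some logic errors in it, but either way the code doesn't run on my computer, because there is so much to compute that Python just freezes. Here it is: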

import numpy as np

# allData is a 64x256x913 array (electrode x time point x trial)

all_experiment_trials = []
for trial in range(allData.shape[2]):
    all_trial_electrodes = []
    for electrode in range(allData.shape[0]):
        for other_electrode in range(allData.shape[0]):
            if electrode == other_electrode:
                pass
            else:
                single_xcorr = max(np.correlate(allData[electrode,:,trial], allData[other_electrode,:,trial], "full"))
                all_trial_electrodes.append(single_xcorr)
    all_experiment_trials.append(all_trial_electrodes)

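Obviously, loops are very slow for this kind of thing. Is there a vectorized solution using NumPy arrays?

I've looked at things like correlate2d(), but I don't think they apply in my case, since I'm not multiplying 2 matrices together.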

Div*_*kar 3

Here's a vectorized approach based on np.einsum:

def vectorized_approach(allData):
    # Get shape
    M,N,R = allData.shape

    # Valid mask based on condition: "if electrode == other_electrode"
    valid_mask = np.mod(np.arange(M*M),M+1)!=0

    # Multiply every electrode pair elementwise (broadcast over axes 0 and 1),
    # then sum over the time axis to get one value per trial and pair
    out = np.einsum('ijkl,ijkl->lij',allData[:,None,:,:],allData[None,:,:,:])

    # Use valid mask to skip columns and have the final output
    return out.reshape(R,-1)[:,valid_mask]
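A note on what the `einsum` call above computes: for two equal-length 1D inputs, `np.correlate` in its default `"valid"` mode returns a single value, the zero-lag dot product, and the summation over the time axis reproduces exactly that. Here is a minimal sketch on a toy array. The sizes, the seed, and the simplified subscripts `'ikl,jkl->lij'` are illustrative choices, not part of the function above.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((3, 5, 2))  # toy sizes: electrodes x time points x trials

# Zero-lag products for every ordered electrode pair, per trial: (trials, M, M)
zero_lag = np.einsum('ikl,jkl->lij', data, data)

# np.correlate with the default mode ("valid") on equal-length inputs
# returns a length-1 array holding the zero-lag dot product
ref = np.correlate(data[0, :, 1], data[2, :, 1])
assert ref.shape == (1,)
assert np.isclose(zero_lag[1, 0, 2], ref[0])

# Flat indices 0, M+1, 2*(M+1), ... are the diagonal (electrode paired with
# itself), so this mask keeps only the off-diagonal pairs
M = data.shape[0]
valid_mask = np.arange(M * M) % (M + 1) != 0
pairs = zero_lag.reshape(data.shape[2], -1)[:, valid_mask]  # (trials, M*(M-1))
```

Each row of `pairs` holds one trial, with the electrode pairs in the same order as the nested loops in the question.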

Runtime test and verification of results:

In [10]: allData = np.random.rand(20,80,200)

In [11]: def org_approach(allData):
    ...:     all_experiment_trials = []
    ...:     for trial in range(allData.shape[2]):
    ...:         all_trial_electrodes = []
    ...:         for electrode in range(allData.shape[0]):
    ...:             for other_electrode in range(allData.shape[0]):
    ...:                 if electrode == other_electrode:
    ...:                     pass
    ...:                 else:
    ...:                     single_xcorr = max(np.correlate(allData[electrode,:,trial], allData[other_electrode,:,trial]))
    ...:                     all_trial_electrodes.append(single_xcorr)
    ...:         all_experiment_trials.append(all_trial_electrodes)
    ...:     return all_experiment_trials
    ...: 

In [12]: %timeit org_approach(allData)
1 loops, best of 3: 1.04 s per loop

In [13]: %timeit vectorized_approach(allData)
100 loops, best of 3: 15.1 ms per loop

In [14]: np.allclose(vectorized_approach(allData),np.asarray(org_approach(allData)))
Out[14]: True
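Note that `org_approach` in the comparison calls `np.correlate` without `"full"`, so the check covers the zero-lag value only. If the maximum over all lags is needed, as in the question's `"full"` call, one possible route is an FFT-based cross-correlation. The following is a sketch under that assumption, not part of the answer above. The function name `max_full_xcorr` is hypothetical, and it loops over trials to keep the per-pair intermediate arrays small.

```python
import numpy as np

def max_full_xcorr(all_data):
    """Max over all lags of the cross-correlation, per trial and ordered pair.

    all_data has shape (electrodes, time points, trials); the result has
    shape (trials, electrodes * (electrodes - 1)), self-pairs excluded.
    """
    M, N, R = all_data.shape
    n_fft = 2 * N - 1  # zero-pad so circular correlation covers all linear lags
    spectra = np.fft.rfft(all_data, n=n_fft, axis=1)  # (M, N, R) complex
    off_diag = ~np.eye(M, dtype=bool)                 # drop electrode == other_electrode
    out = np.empty((R, M * (M - 1)))
    for r in range(R):
        s = spectra[:, :, r]
        # Cross-spectrum of every ordered pair (i, j): (M, M, frequencies)
        cross = s[:, None, :] * np.conj(s[None, :, :])
        # Back to the lag domain: (M, M, 2N-1), every lag from -(N-1) to N-1
        xc = np.fft.irfft(cross, n=n_fft, axis=-1)
        out[r] = xc.max(axis=-1)[off_diag]
    return out
```

On the question's 64x256x913 array this would return a 913x4032 array, since the 64 self-pairs are skipped, just as in the loop version.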