Python NumPy vs. Matlab: array assignment performance

Bar*_*akP 5 python arrays performance matlab numpy

I have been writing Matlab code for many years and recently started writing Python. Let me try to explain the problem I am facing:

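Part of my code maps cells of a large array (for this example, an image of size 1080x1400) to a smaller array (a grid of size 770x770). The cells of the large array can map to the whole grid or to only part of it, which means that many cells of the large array can map to the same cell of the small array. I wrote two versions of the code, one in Matlab and one in Python.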

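For some reason, the Matlab code averages 41 ms per run, while the Python code, run in PyCharm, averages 4.1 s per run (both measured over 100 runs). Is there anything I can do to substantially improve the NumPy performance?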

Although I normally write vectorized code, in this case the code is written with for loops, which I thought was appropriate here.

Thanks

Links to sample input data:

https://technionmail-my.sharepoint.com/:x:/g/personal/barakp_campus_technion_ac_il/Eb-JELhUNslJm219qI6bflEBxEv3XnOsGTQaTZN7GfzUbA?e=CeUjRT

https://technionmail-my.sharepoint.com/:i:/g/personal/barakp_campus_technion_ac_il/ETOjjmtzedpBi9YMKfI7778Bz3also9U9acvosMM1gKK0w?e=cQ4afV

Matlab code:

%%
clear;clc;
InputCoord = readmatrix('InputCoord.csv');
%%
Wx = InputCoord(:,3)' + 1;
Wy = InputCoord(:,4)' + 1;
OutMtx = zeros(770,770);

%%
fp_Row = InputCoord(:,1)' + 1;
fp_Col = InputCoord(:,2)' + 1;
DataMtx = single(imread('DataMtx.tif'))./255;
%%
number_of_times = 100;
t_stop = zeros(number_of_times,1);
for jj = 1:number_of_times
    N = 1;
    t_start = tic;
    for ii = 1:size(Wx,2)
        Wx_ind = Wx(ii);
        Wy_ind = Wy(ii);
        fp_Row_ind = fp_Row(ii);
        fp_Col_ind = fp_Col(ii);
        if ii>1 && (Wx(ii)~=Wx(ii-1) || Wy(ii)~=Wy(ii-1))
            N = 1;
        end
        
        OutMtx(Wx_ind, Wy_ind) = ((N-1)*OutMtx(Wx_ind, Wy_ind) + DataMtx(fp_Row_ind, fp_Col_ind))/N;
        N = N + 1;
    end
    
    t_stop(jj) = toc(t_start);
end

Python code:

import numpy as np
import cv2
import time


InputCoord = np.genfromtxt('InputCoord.csv', delimiter=',')
number_of_coords = np.shape(InputCoord)[0]
Wx = InputCoord[:, 2].astype(dtype=np.int32).reshape((1, number_of_coords))
Wy = InputCoord[:, 3].astype(dtype=np.int32).reshape((1, number_of_coords))
OutMtx = np.zeros((770, 770))

fp_Row = InputCoord[:, 0].astype(dtype=np.int32).reshape((1, number_of_coords))
fp_Col = InputCoord[:, 1].astype(dtype=np.int32).reshape((1, number_of_coords))
DataMtx = cv2.imread('DataMtx.tif', -1).astype(dtype=np.float32) / 255
# print(f' DataMtx flags:{DataMtx.flags}')
DataMtxf = np.asarray(DataMtx, order='F')
number_of_times = 100
t_stop = np.zeros((1, number_of_times))
for jj in range(number_of_times):
    t_start = time.time()
    N = 1
    for ii in range(number_of_coords):
        Wx_ind = Wx[0, ii]
        Wy_ind = Wy[0, ii]
        fp_Row_ind = fp_Row[0, ii]
        fp_Col_ind = fp_Col[0, ii]
        # Python is 0-indexed, so "not the first element" is ii > 0 (Matlab's ii > 1).
        if (ii > 0) and ((Wx[0, ii] != Wx[0, ii - 1]) or (Wy[0, ii] != Wy[0, ii - 1])):
            N = 1

        OutMtx[Wx_ind, Wy_ind] = ((N - 1) * OutMtx[Wx_ind, Wy_ind] + DataMtx[fp_Row_ind, fp_Col_ind]) / N
        N = N + 1
    t_stop[0, jj] = time.time() - t_start
print(f'mean update time = {np.mean(t_stop)}')

Bar*_*akP 2

Problem solved:

I used Numba with JIT compilation, and the Python code now averages 17 ms per run!

Thanks

import numpy as np
import cv2
import time
import numba

@numba.jit(nopython=True)
def Pix2Grid_MovAvg(DataMtx, OutMtx, Wx, Wy, fp_Row, fp_Col, number_of_coords):
    N = 1

    for ii in range(number_of_coords):
        Wx_ind = Wx[0, ii]
        Wy_ind = Wy[0, ii]
        fp_Row_ind = fp_Row[0, ii]
        fp_Col_ind = fp_Col[0, ii]
        # Python is 0-indexed, so "not the first element" is ii > 0 (Matlab's ii > 1).
        if (ii > 0) and ((Wx[0, ii] != Wx[0, ii - 1]) or (Wy[0, ii] != Wy[0, ii - 1])):
            N = 1

        OutMtx[Wx_ind, Wy_ind] = ((N - 1) * OutMtx[Wx_ind, Wy_ind] + DataMtx[fp_Row_ind, fp_Col_ind]) / N
        N += 1
    return OutMtx

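For comparison, the same running average can probably also be written without an explicit loop. The sketch below is not part of the original answer. It rests on two assumptions: N resets every time the target cell changes (including between the first and second elements), so each grid cell ends up holding the mean of the pixels in the last consecutive run that targeted it, and the output grid starts out as zeros. The function name `grid_run_means` and the tiny demo arrays are made up for illustration, and the index arrays are assumed to be 1-D.

```python
import numpy as np


def grid_run_means(data, wx, wy, rows, cols, grid_shape):
    """Mean of data[rows, cols] over each consecutive run of equal (wx, wy).

    Assumes 1-D integer index arrays of equal length. If a grid cell is
    targeted by several separate runs, the last run wins, as in the loop.
    """
    out = np.zeros(grid_shape, dtype=np.float64)
    if wx.size == 0:
        return out

    # Gather the source pixels once, in float64 like the loop's accumulator.
    values = data[rows, cols].astype(np.float64)

    # A run starts at index 0 and wherever the target cell changes.
    changed = (wx[1:] != wx[:-1]) | (wy[1:] != wy[:-1])
    starts = np.concatenate(([0], np.flatnonzero(changed) + 1))

    # Sum each run, then divide by its length to get the run mean.
    sums = np.add.reduceat(values, starts)
    lengths = np.diff(np.concatenate((starts, [wx.size])))
    means = sums / lengths

    # Keep only the last run per cell, because NumPy does not guarantee the
    # order in which repeated indices are applied in a fancy assignment.
    cell = wx[starts].astype(np.int64) * grid_shape[1] + wy[starts]
    _, first_in_reversed = np.unique(cell[::-1], return_index=True)
    last = cell.size - 1 - first_in_reversed

    out[wx[starts][last], wy[starts][last]] = means[last]
    return out


# Tiny made-up example: cell (0, 0) is targeted by two separate runs.
rng = np.random.default_rng(0)
data = rng.random((6, 8)).astype(np.float32)
wx = np.array([0, 0, 0, 1, 1, 0, 2, 2])
wy = np.array([0, 0, 0, 1, 1, 0, 3, 3])
rows = rng.integers(0, 6, size=wx.size)
cols = rng.integers(0, 8, size=wx.size)
grid = grid_run_means(data, wx, wy, rows, cols, (3, 4))
```

With the arrays from the question, the call would look something like `grid_run_means(DataMtx, Wx.ravel(), Wy.ravel(), fp_Row.ravel(), fp_Col.ravel(), (770, 770))`. I have not timed this against the Numba version, and the results can differ from the loop in the last few floating-point digits, because a sum divided by a count rounds differently than an incremental average.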