在MATLAB中优化重复估计(当前是一个循环)

Fre*_*ick 3 math optimization matlab matrix least-squares

我发现自己需要对图像中的每个像素进行最小二乘(或类似的基于矩阵的操作).每个像素都有一组与之关联的数字,因此可以将其排列为3D矩阵.

(可以跳过下一位)

通过最小二乘估计快速解释我的意思:

假设我们有一些二次系统,它由Y = Ax ^ 2 + Bx + C建模,我们正在寻找那些A,B,C系数.使用X的几个样本(至少3个)和相应的Y,我们可以通过以下方式估算它们:

  1. 将(例如10个)X样本排列成矩阵 X = [x(:).^2 x(:) ones(10,1)];
  2. 将Y个样本排列成类似的矩阵: Y = y(:);
  3. 通过求解来估算系数A,B,C: coeffs = (X'*X)^(-1)*X'*Y;

如果您愿意,可以自己尝试:

A = 5; B = 2; C = 1;
x = 1:10;
y = A*x(:).^2 + B*x(:) + C + .25*randn(10,1); % added some noise here
X = [x(:).^2 x(:) ones(10,1)];
Y = y(:);
coeffs = (X'*X)^-1*X'*Y

coeffs =

  5.0040
  1.9818
  0.9241
Run Code Online (Sandbox Code Playgroud)

如果我迷失了你,请继续注意

*MAJOR REWRITE*我已修改为使其接近我所拥有的真正问题,并仍然使其成为最小的工作示例.

问题设置

%// Setup
xdim = 500; 
ydim = 500; 
ncoils = 8; 
nshots = 4; 
%// matrix size for each pixel is ncoils x nshots (an overdetermined system)

%// each pixel has a matrix stored in the 3rd and 4rth dimensions
regressor = randn(xdim,ydim, ncoils,nshots); 
regressand = randn(xdim, ydim,ncoils); 
Run Code Online (Sandbox Code Playgroud)

所以我的问题是我必须为图像中的每个像素做一个(X'*X)^ - 1*X'*Y(最小二乘或类似)操作.虽然它本身是矢量化/矩阵化的,但我必须为每个像素执行它的唯一方法是在for循环中,例如:

原始代码风格

%// Actual work
tic 
estimate = zeros(xdim,ydim);
for col=1:size(regressor,2)
    for row=1:size(regressor,1)

        X = squeeze(regressor(row,col,:,:));
        Y = squeeze(regressand(row,col,:));

        B = X\Y; 
        % B = (X'*X)^(-1)*X'*Y; %// equivalently

        estimate(row,col) = B(1);
    end
end
toc

Elapsed time = 27.6 seconds
Run Code Online (Sandbox Code Playgroud)

编辑响应评论和其他想法
我尝试了一些事情:
1.重塑为长矢量并删除双for循环.这节省了一些时间.
2. squeeze通过permute手边的图片删除(和在线移调):这节省了更多的时间.

目前的例子:

%// Actual work
tic 
estimate2 = zeros(xdim*ydim,1);
regressor_mod = permute(regressor,[3 4 1 2]);
regressor_mod = reshape(regressor_mod,[ncoils,nshots,xdim*ydim]);
regressand_mod = permute(regressand,[3 1 2]);
regressand_mod = reshape(regressand_mod,[ncoils,xdim*ydim]);

for ind=1:size(regressor_mod,3) % for every pixel

    X = regressor_mod(:,:,ind);
    Y = regressand_mod(:,ind);

    B = X\Y;

    estimate2(ind) = B(1);

end
estimate2 = reshape(estimate2,[xdim,ydim]);
toc

Elapsed time = 2.30 seconds (avg of 10)
isequal(estimate2,estimate) == 1;
Run Code Online (Sandbox Code Playgroud)

Rody Oldenhuis的方式

N  = xdim*ydim*ncoils;  %// number of columns
M  = xdim*ydim*nshots;    %// number of rows

ii = repmat(reshape(1:N,[ncoils,xdim*ydim]),[nshots 1]); %//column indicies
jj = repmat(1:M,[ncoils 1]); %//row indicies

X = sparse(ii(:),jj(:),regressor_mod(:));
Y = regressand_mod(:);

B = X\Y;

B = reshape(B(1:nshots:end),[xdim ydim]);

Elapsed time = 2.26 seconds (avg of 10) 
            or 2.18 seconds (if you don't include the definition of N,M,ii,jj)
Run Code Online (Sandbox Code Playgroud)

所以问题是:
是否有(甚至)更快的方式?

(我不这么认为.)

MrA*_*man 6

通过预先计算X的转置,你可以实现2倍的加速.即

for x=1:size(picture,2) % second dimension b/c already transposed

    X = picture(:,x);
    XX = X';
    Y = randn(n_timepoints,1);
    %B = (X'*X)^-1*X'*Y; ;
    B = (XX*X)^-1*XX*Y; 
    est(x) = B(1);

end

Before: Elapsed time is 2.520944 seconds.
After: Elapsed time is 1.134081 seconds.
Run Code Online (Sandbox Code Playgroud)

编辑:您的代码,因为它在您的最新编辑中,可以替换为以下

tic
xdim = 500; 
ydim = 500; 
n_timepoints = 10; % for example

% Actual work
picture = randn(xdim,ydim,n_timepoints);
picture = reshape(picture, [xdim*ydim,n_timepoints])'; % note transpose
YR = randn(n_timepoints,size(picture,2));

% (XX*X).^-1 = sum(picture.*picture).^-1;
% XX*Y = sum(picture.*YR);
est = sum(picture.*picture).^-1 .* sum(picture.*YR);

est = reshape(est,[xdim,ydim]);
toc

Elapsed time is 0.127014 seconds.
Run Code Online (Sandbox Code Playgroud)

这是最新编辑速度的一个数量级,结果与之前的方法完全相同.

EDIT2:

好的,所以如果X是一个矩阵,而不是一个向量,那么事情就会复杂一些.我们基本上希望尽可能在for-loop之外进行预先计算,以降低成本.我们也可以通过XT*X手动计算获得显着的加速- 因为结果将始终是对称矩阵,我们可以削减一些角来加快速度.一,对称乘法函数:

function XTX = sym_mult(X) % X is a 3-d matrix

n = size(X,2);
XTX = zeros(n,n,size(X,3));
for i=1:n
    for j=i:n
        XTX(i,j,:) = sum(X(:,i,:).*X(:,j,:));
        if i~=j
            XTX(j,i,:) = XTX(i,j,:);
        end
    end
end
Run Code Online (Sandbox Code Playgroud)

现在是实际的计算脚本

xdim = 500; 
ydim = 500; 
n_timepoints = 10; % for example

Y = randn(10,xdim*ydim);
picture = randn(xdim,ydim,n_timepoints); % 500x500x10

% Actual work  
tic  % start timing

picture = reshape(picture, [xdim*ydim,n_timepoints])';

% Here we precompute the (XT*Y) calculation to speed things up later
picture_y = [sum(Y);sum(Y.*picture)]; 

% initialize
est = zeros(size(picture,2),1); 

picture = permute(picture,[1,3,2]);
XTX = cat(2,ones(n_timepoints,1,size(picture,3)),picture);
XTX = sym_mult(XTX); % precompute (XT*X) for speed

X = zeros(2,2); % preallocate for speed
XY = zeros(2,1);

for x=1:size(picture,2) % second dimension b/c already transposed

    %For some reason this is a lot faster than X = XTX(:,:,x);
    X(1,1) = XTX(1,1,x);
    X(2,1) = XTX(2,1,x);
    X(1,2) = XTX(1,2,x);
    X(2,2) = XTX(2,2,x);
    XY(1) = picture_y(1,x);
    XY(2) = picture_y(2,x);

    % Here we utilise the fact that A\B is faster than inv(A)*B
    % We also use the fact that (A*B)*C = A*(B*C) to speed things up
    B = X\XY;
    est(x) = B(1); 
end
est = reshape(est,[xdim,ydim]); 
toc % end timing

Before: Elapsed time is 4.56 seconds.
After: Elapsed time is 2.24 seconds.
Run Code Online (Sandbox Code Playgroud)

这是一个大约2倍的速度.这个代码应该是可扩展的,X是你想要的任何尺寸.例如,在X = [1 xx ^ 2]的情况下,您将更改为picture_y以下内容

picture_y = [sum(Y);sum(Y.*picture);sum(Y.*picture.^2)];
Run Code Online (Sandbox Code Playgroud)

XTX改为

XTX = cat(2,ones(n_timepoints,1,size(picture,3)),picture,picture.^2); 
Run Code Online (Sandbox Code Playgroud)

您还可以在代码中将大量2s更改为3s,然后添加XY(3) = picture_y(3,x)到循环中.我相信它应该是相当直接的.