Performance of index creation

Den*_*din 10 performance matlab

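While trying to decide which indexing method to recommend, I tried to measure their performance. However, the measurements confused me. I ran the test several times, in different orders, and the results stayed consistent. Here is how I measured the performance: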

for N = [10000 15000 100000 150000]
    x =  round(rand(N,1)*5)-2;
    idx1 = x~=0;
    idx2 = abs(x)>0;

    tic
    for t = 1:5000
        idx1 = x~=0;
    end
    toc

    tic
    for t = 1:5000
        idx2 = abs(x)>0;
    end
    toc
end

These are the results:

Elapsed time is 0.203504 seconds.
Elapsed time is 0.230439 seconds.

Elapsed time is 0.319840 seconds.
Elapsed time is 0.352562 seconds.

Elapsed time is 2.118108 seconds. % This is the strange part
Elapsed time is 0.434818 seconds.

Elapsed time is 0.508882 seconds.
Elapsed time is 0.550144 seconds.

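I checked values around 100000 and the same thing happens there; the strange measurements occur even at 50000.

So my question is: has anyone else experienced this for a certain range of sizes, and what causes it? (Is it a bug?)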

nkj*_*kjt 7

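I think this has to do with the JIT (the results below were obtained with 2011b). Depending on the system, the MATLAB version, the size of the variables, and exactly what is inside the loop, using the JIT is not always faster. This is related to the "warm-up" effect: if you run an m-file more than once in a session, it is sometimes quicker after the first run, because the accelerator only has to compile some parts of the code once.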
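As a rough sketch of how timings like the two sets below could be collected: `feature('accel', ...)` is the undocumented switch named in the headings that follow. The vector size, the explicit warm-up call, and the variable names `settings` and `elapsed` are assumptions added for illustration, and the switch may behave differently (or have no effect) in releases other than 2011b.

```matlab
% Sketch: time both index expressions with the accelerator on and off.
% feature('accel', ...) is undocumented; its effect may differ by release.
N = 100000;                        % size at which the question saw the anomaly
x = round(rand(N,1)*5) - 2;        % same test data as in the question
reps = 5000;
settings = {'on', 'off'};
elapsed = zeros(numel(settings), 2);   % rows: accel setting; columns: ~=, abs()>0

for s = 1:numel(settings)
    feature('accel', settings{s});
    idx1 = x ~= 0;                 % warm-up run, excluded from the timing
    idx2 = abs(x) > 0;

    tStart = tic;
    for t = 1:reps
        idx1 = x ~= 0;
    end
    elapsed(s, 1) = toc(tStart);

    tStart = tic;
    for t = 1:reps
        idx2 = abs(x) > 0;
    end
    elapsed(s, 2) = toc(tStart);
end
feature('accel', 'on');            % restore the default setting
disp(elapsed);
```

Using `toc(tStart)` with an explicit handle keeps the two measurements independent of any other `tic` call in the session.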

JIT on (feature accel on)

Elapsed time is 0.176765 seconds.
Elapsed time is 0.185301 seconds.

Elapsed time is 0.252631 seconds.
Elapsed time is 0.284415 seconds.

Elapsed time is 1.782446 seconds.
Elapsed time is 0.693508 seconds.

Elapsed time is 0.855005 seconds.
Elapsed time is 1.004955 seconds.

JIT off (feature accel off)

Elapsed time is 0.143924 seconds.
Elapsed time is 0.184360 seconds.

Elapsed time is 0.206405 seconds.
Elapsed time is 0.306424 seconds.

Elapsed time is 1.416654 seconds.
Elapsed time is 2.718846 seconds.

Elapsed time is 2.110420 seconds.
Elapsed time is 4.027782 seconds.

ETA: it is kind of interesting to see what happens if you use integers instead of doubles:

JIT on, same code, but with x converted using int8

Elapsed time is 0.202201 seconds.
Elapsed time is 0.192103 seconds.

Elapsed time is 0.294974 seconds.
Elapsed time is 0.296191 seconds.

Elapsed time is 2.001245 seconds.
Elapsed time is 2.038713 seconds.

Elapsed time is 0.870500 seconds.
Elapsed time is 0.898301 seconds.

JIT off, using int8

Elapsed time is 0.198611 seconds.
Elapsed time is 0.187589 seconds.

Elapsed time is 0.282775 seconds.
Elapsed time is 0.282938 seconds.

Elapsed time is 1.837561 seconds.
Elapsed time is 1.846766 seconds.

Elapsed time is 2.746034 seconds.
Elapsed time is 2.760067 seconds.
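The answer does not show the conversion itself. A minimal sketch, assuming the only change to the question's code is wrapping the test data in `int8` (the names `xDouble` and `xInt8` are illustrative), might look like this:

```matlab
% Sketch: build the same test data as double and as int8 (assumed conversion).
N = 100000;
xDouble = round(rand(N,1)*5) - 2;   % integer values in -2..3, stored as double
xInt8   = int8(xDouble);            % same values, 1 byte per element instead of 8

% Both index expressions from the question give the same logical mask
% regardless of the storage class.
idxDouble = xDouble ~= 0;
idxInt8   = abs(xInt8) > 0;
assert(isequal(idxDouble, idxInt8));

whos xDouble xInt8                  % compare memory footprint of the two vectors
```

Because the int8 vector is one eighth the size in memory, any size threshold tied to bytes rather than element count would be expected to shift, which is one possible reading of the timings above.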


mar*_*sei 6

This may be due to some automatic optimization that MATLAB applies through its Basic Linear Algebra Subprograms (BLAS).

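Just like yours, my configuration (OSX 10.8.4, R2012a with default settings) takes longer to compute idx1 = x~=0 for x with about 100000 elements than for x with about 110000 elements. See the left panel of the figure, where the processing time (y-axis) is measured for different vector sizes (x-axis). You will see lower processing times for N > 103000. In this panel I also show the number of cores that were active during the computation. You will see that there is no drop for the one-core configuration. This means that MATLAB does not optimize the execution of ~= when only 1 core is active (no parallelization is possible). MATLAB enables some optimization routines when two conditions are met: multiple cores and a vector of sufficient size.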

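The right panel shows the results when feature('accel','on'/'off') is set to off (doc). Here only one core is active (the 1-core and 4-core curves are identical), so no optimization is possible.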

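Finally, the function I used to activate and deactivate cores is maxNumCompThreads. According to Loren Shure, maxNumCompThreads controls both the JIT and BLAS. Since feature('JIT','on'/'off') played no role in the performance, BLAS is the last remaining candidate.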
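As a small illustration of that function, the sketch below times the same loop with one computational thread and then with the session's previous limit. It relies on `maxNumCompThreads(n)` returning the previous setting; the vector size and the names `previous`, `tSingle`, and `tMulti` are assumptions chosen for the example.

```matlab
% Sketch: compare x~=0 with a single thread and with the previous thread limit.
N = 105000;                          % above the ~103000 threshold seen in the figure
x = round(rand(N,1)*5) - 2;
reps = 5000;

previous = maxNumCompThreads(1);     % force one thread; returns the old limit
tStart = tic;
for t = 1:reps
    idx1 = x ~= 0;
end
tSingle = toc(tStart);

maxNumCompThreads(previous);         % restore the original limit
tStart = tic;
for t = 1:reps
    idx1 = x ~= 0;
end
tMulti = toc(tStart);

fprintf('1 thread: %.3f s, %d thread(s): %.3f s\n', tSingle, previous, tMulti);
```

Saving and restoring the previous limit keeps the experiment from changing the thread setting for the rest of the session.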

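I will leave the last word to Loren: "The main message here is that you generally don't need to use this function [maxNumCompThreads]! Why? Because we want MATLAB to do the best job for you."

[Figure: processing time vs. vector size for 1 to 4 computational threads; left panel accel on, right panel accel off]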

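accel = {'on';'off'};
figure('Color','w');
N = 100000:1000:105000;              % vector sizes around the observed threshold

for ind_accel = 2:-1:1
    % feature('accel',...) is undocumented; call it directly rather than via eval
    feature('accel', accel{ind_accel});
    tElapsed = zeros(4,length(N));   % rows: thread count, columns: vector size
    for ind_core = 1:4
        maxNumCompThreads(ind_core);
        n_core = maxNumCompThreads;  % number of threads actually in effect
        for ii = 1:length(N)
            % print the vector size N(ii), not the loop index
            fprintf('core asked: %d(true:%d) - N:%d\n',ind_core,n_core,N(ii));
            x =  round(rand(N(ii),1)*5)-2;
            idx1 = x~=0;             % warm-up run, not timed
            tStart = tic;
            for t = 1:5000
                idx1 = x~=0;
            end
            tElapsed(ind_core,ii) = toc(tStart);
        end
    end
    h2 = subplot(1,2,ind_accel);     % left: accel on, right: accel off
    plot(N, tElapsed,'-o','MarkerSize',10);
    legend({'1','2','3','4'});       % one curve per thread count
    xlabel('Vector size','FontSize',14);
    ylabel('Processing time','FontSize',14);
    set(gca,'FontSize',14,'YLim',[0.2 0.7]);
    title(['accel ' accel{ind_accel}]);
end
maxNumCompThreads('automatic');      % hand thread control back to MATLAB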