Posts by Aur*_*kus

Matlab + CUDA is slow at solving the matrix-vector equation A*x = B

I am solving the equation A*x = B, where A is a matrix, B is a vector, and x is the unknown solution vector.

Hardware specs: Intel i7 3630QM (4 cores), nVidia GeForce GT 640M (384 CUDA cores)

Here is an example:

>> A=rand(5000);
>> B=rand(5000,1);
>> Agpu=gpuArray(A);
>> Bgpu=gpuArray(B);
>> tic;A\B;toc;
Elapsed time is 1.382281 seconds.
>> tic;Agpu\Bgpu;toc;
Elapsed time is 4.775395 seconds.

Somehow the GPU is much slower here. Why? It is also slower for FFT, INV, and LU computations, which I would expect to be related to matrix division.

However, the GPU is much faster at matrix multiplication (same data):

>> tic;A*B;toc;
Elapsed time is 0.014700 seconds.
>> tic;Agpu*Bgpu;toc;
Elapsed time is 0.000505 seconds.
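
One caveat about the numbers above: gpuArray operations in MATLAB can run asynchronously, so a bare tic/toc may stop the clock before the GPU has finished. The following is a minimal sketch of a synchronized measurement, assuming a Parallel Computing Toolbox release that provides timeit and gputimeit. The names gpu, Cgpu, tSync, tCpu and tGpu are illustrative and not from the original post.

```matlab
% Sketch: time the same matrix-vector product with explicit GPU synchronization.
A = rand(5000);
B = rand(5000, 1);
Agpu = gpuArray(A);
Bgpu = gpuArray(B);

gpu = gpuDevice;            % handle to the currently selected GPU

Cgpu = Agpu * Bgpu;         % warm-up call, excluded from the timing
wait(gpu);

tic;
Cgpu = Agpu * Bgpu;
wait(gpu);                  % block until the GPU has actually finished
tSync = toc;

% timeit/gputimeit run the function several times and handle synchronization.
tCpu = timeit(@() A * B);
tGpu = gputimeit(@() Agpu * Bgpu);

fprintf('CPU: %.6f s, GPU (tic/toc + wait): %.6f s, GPU (gputimeit): %.6f s\n', ...
        tCpu, tSync, tGpu);
```

If the synchronized GPU time comes out noticeably larger than 0.000505 seconds, the multiplication speedup above is probably overstated.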

The main question: why is A\B (mldivide) so slow on the GPU compared to the CPU?
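
One factor that may be worth ruling out is precision: rand(5000) produces double-precision data, and consumer GeForce cards generally have much lower double-precision than single-precision throughput. The sketch below compares the solve in both precisions. It is only a diagnostic under that assumption, not a confirmed explanation, and the variable names (Ad, Bd, As, Bs, tCpu, tDouble, tSingle) are made up for illustration.

```matlab
% Sketch: compare mldivide on the CPU and on the GPU in double vs single precision.
n = 5000;
A = rand(n);
B = rand(n, 1);

Ad = gpuArray(A);            % double precision on the GPU
Bd = gpuArray(B);
As = gpuArray(single(A));    % single precision on the GPU
Bs = gpuArray(single(B));

tCpu    = timeit(@() A \ B);
tDouble = gputimeit(@() Ad \ Bd);
tSingle = gputimeit(@() As \ Bs);

fprintf('CPU double: %.3f s\nGPU double: %.3f s\nGPU single: %.3f s\n', ...
        tCpu, tDouble, tSingle);
```

Single precision gives a less accurate solution, so this only makes sense as a diagnostic unless the reduced accuracy is acceptable for the problem.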

Update

Here are more results, with A and B (on the CPU) and AA and BB (on the GPU) all set to rand(5000):

>> tic;fft(A);toc;
Elapsed time is 0.117189 seconds.
>> tic;fft(AA);toc;
Elapsed time is 1.062969 seconds.
>> tic;fft(AA);toc;
Elapsed time is 0.542242 seconds.
>> tic;fft(AA);toc;
Elapsed time is 0.229773 …
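
The GPU fft times above drop with each repeated call, which suggests the first calls include one-time setup cost. Here is a minimal sketch that records several consecutive runs with synchronization so that warm-up and steady-state times can be told apart. nRuns, t, F and gpu are illustrative names, and the run count of 5 is arbitrary.

```matlab
% Sketch: record consecutive GPU fft timings to expose first-call overhead.
AA = gpuArray(rand(5000));
gpu = gpuDevice;             % handle to the currently selected GPU

nRuns = 5;
t = zeros(1, nRuns);
for k = 1:nRuns
    tic;
    F = fft(AA);
    wait(gpu);               % make sure the transform has completed
    t(k) = toc;
end

fprintf('Run %d: %.6f s\n', [1:nRuns; t]);
fprintf('First run: %.6f s, fastest later run: %.6f s\n', t(1), min(t(2:end)));
```

For a fair comparison with the CPU, the later runs are the more representative ones.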

performance matlab cuda matrix linear-algebra

