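Below is a C++ implementation that compares the time taken by Eigen and by a plain for loop to compute a matrix-matrix product. The for loop has already been optimized to minimize cache misses. For small matrices the for loop is faster than Eigen, but as the size grows it becomes slower (up to a factor of 2 for 500 by 500 matrices). What else should I do to compete with Eigen? Is blocking the reason for Eigen's better performance? If so, how should I add blocking to the for loop?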
#include <cstdlib>   // atoi, srand
#include <ctime>     // clock, time
#include <iostream>
#include <Eigen/Dense>
int main(int argc, char* argv[]) {
srand(time(NULL));
// Read the matrix dimension N from the first command-line argument
int N = atoi(argv[1]);
int M = N*N;
// The matrices stored as flat row-major arrays
// (variable-length arrays are a compiler extension, not standard C++)
double a[M];
double b[M];
double c[M];
// Initializing Eigen Matrices
Eigen::MatrixXd a_E = Eigen::MatrixXd::Random(N,N);
Eigen::MatrixXd b_E = Eigen::MatrixXd::Random(N,N);
Eigen::MatrixXd c_E(N,N);
double CPS = CLOCKS_PER_SEC;
clock_t start, end;
// Matrix-matrix product by Eigen
start = clock();
c_E = a_E*b_E;
end = …
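
On the blocking question, here is a minimal sketch of what loop blocking (tiling) could look like for flat row-major storage like that used above. The function name `matmul_blocked`, the use of `std::vector`, and the default tile size of 64 are assumptions made for illustration, not part of the original code; a good tile size depends on the cache sizes of the machine and has to be found by measurement.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original post): computes C = A * B for
// row-major N x N matrices, walking the iteration space in BS x BS tiles so
// that the tiles of A, B and C being worked on can stay in cache.
void matmul_blocked(const std::vector<double>& a,
                    const std::vector<double>& b,
                    std::vector<double>& c,
                    std::size_t N, std::size_t BS = 64) {
    std::fill(c.begin(), c.end(), 0.0);
    for (std::size_t ii = 0; ii < N; ii += BS) {
        const std::size_t iMax = std::min(ii + BS, N);
        for (std::size_t kk = 0; kk < N; kk += BS) {
            const std::size_t kMax = std::min(kk + BS, N);
            for (std::size_t jj = 0; jj < N; jj += BS) {
                const std::size_t jMax = std::min(jj + BS, N);
                // Multiply one tile of A by one tile of B and accumulate
                // into the matching tile of C.
                for (std::size_t i = ii; i < iMax; ++i) {
                    for (std::size_t k = kk; k < kMax; ++k) {
                        const double aik = a[i * N + k];
                        // Innermost loop is unit-stride in both b and c.
                        for (std::size_t j = jj; j < jMax; ++j) {
                            c[i * N + j] += aik * b[k * N + j];
                        }
                    }
                }
            }
        }
    }
}
```

Blocking alone may not close the gap: Eigen's product kernels combine cache blocking with explicit SIMD vectorization, so compiler flags such as `-O3 -march=native` are likely to matter for the hand-written loop as well.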