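Edit: the huge performance difference was caused by a bug in my test. Set up correctly, Eigen is 2 to 3 times faster.
I noticed that sparse matrix multiplication with the C++ Eigen library is much slower than with Python's scipy.sparse library. scipy.sparse finishes in about 0.03 seconds what takes Eigen about 25 seconds. Maybe I am doing something wrong in Eigen?
Here is the Python code: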
from scipy import sparse
from time import time
import random as rn
N_VALUES = 200000
N_ROWS = 400000
N_COLS = 400000
rows_a = rn.sample(range(N_COLS), N_VALUES)
cols_a = rn.sample(range(N_ROWS), N_VALUES)
values_a = [rn.uniform(0,1) for _ in xrange(N_VALUES)]
rows_b = rn.sample(range(N_COLS), N_VALUES)
cols_b = rn.sample(range(N_ROWS), N_VALUES)
values_b = [rn.uniform(0,1) for _ in xrange(N_VALUES)]
big_a = sparse.coo_matrix((values_a, (cols_a, rows_a)), shape=(N_ROWS, N_COLS))
big_b = sparse.coo_matrix((values_b, (cols_b, rows_b)), shape=(N_ROWS, N_COLS))
big_a = big_a.tocsr()
big_b = big_b.tocsr()
start = time()
AB = big_a * big_b;
end = time()
print 'time taken : {}'.format(end - start)
The C++ code:
#include <iostream>
#include <cstdlib>
#include <vector>
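#include <algorithm>
#include <ctime>   // clock, CLOCKS_PER_SEC
#include <random>  // std::uniform_real_distribution, std::default_random_engine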
#include <Eigen/Dense>
#include <Eigen/Sparse>
using namespace Eigen;
std::vector<long> gen_random_sample(long min, long max, long sample_size);
double get_random_double(double min, double max);
std::vector<double> get_vector_of_rn_doubles(int length, double min, double max);
int main()
{
long N_COLS = 400000;
long N_ROWS = 400000;
long N_VALUES = 200000;
SparseMatrix<double> big_A(N_ROWS, N_COLS);
std::vector<long> cols_a = gen_random_sample(0, N_COLS, N_VALUES);
std::vector<long> rows_a = gen_random_sample(0, N_COLS, N_VALUES);
std::vector<double> values_a = get_vector_of_rn_doubles(N_VALUES, 0, 1);
for (int i = 0; i < N_VALUES; i++)
big_A.insert(cols_a[i], cols_a[i]) = values_a[i];
// big_A.makeCompressed(); // slows things down
SparseMatrix<double> big_B(N_ROWS, N_COLS);
std::vector<long> cols_b = gen_random_sample(0, N_COLS, N_VALUES);
std::vector<long> rows_b = gen_random_sample(0, N_COLS, N_VALUES);
std::vector<double> values_b = get_vector_of_rn_doubles(N_VALUES, 0, 1);
for (int i = 0; i < N_VALUES; i++)
big_B.insert(cols_b[i], cols_b[i]) = values_b[i];
// big_B.makeCompressed();
SparseMatrix<double> big_AB(N_ROWS, N_COLS);
clock_t begin = clock();
big_AB = (big_A * big_B); //.pruned();
clock_t end = clock();
double elapsed_secs = double(end - begin) / CLOCKS_PER_SEC;
std::cout << "Time taken : " << elapsed_secs << std::endl;
}
std::vector<long> gen_random_sample(long min, long max, long sample_size)
{
std::vector<long> my_vector(sample_size); // THE BUG: this pre-fills sample_size zeros; the correct declaration is std::vector<long> my_vector;
for (long i = min; i != max; i++)
{
my_vector.push_back(i);
}
std::random_shuffle(my_vector.begin(), my_vector.end());
std::vector<long> new_vec = std::vector<long>(my_vector.begin(), my_vector.begin() + sample_size);
return new_vec;
}
double get_random_double(double min, double max)
{
static std::default_random_engine re; // shared across calls, so successive calls give different values
std::uniform_real_distribution<double> unif(min, max);
return unif(re);
}
std::vector<double> get_vector_of_rn_doubles(int length, double min, double max)
{
std::vector<double> my_vector(length);
for (int i=0; i < length; i++)
{
my_vector[i] = get_random_double(min, max);
}
return my_vector;
}
I compile with: g++ -std=c++11 -I/usr/include/eigen3 time_eigen.cpp -o my_exec -O2 -DNDEBUG.
Am I missing a way to do fast sparse multiplication with Eigen?
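If you compile without -DNDEBUG, you will see that your matrices are corrupted: you insert the same element multiple times, which the insert method does not allow.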
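The comment marked THE BUG in the question's gen_random_sample points at one source of these repeats: constructing the vector with sample_size elements fills it with zeros before the range is appended, so the shuffled sample contains index 0 many times. The sketch below uses only the standard library to contrast that behavior with a corrected sampler. The names sample_with_padding, sample_distinct and count_distinct are illustrative and not from the original post, and std::shuffle with std::mt19937 is swapped in for std::random_shuffle.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <set>
#include <vector>

// Mirrors the question's helper: the pool starts with sample_size zeros,
// and the range [min, max) is appended after them.
std::vector<long> sample_with_padding(long min, long max, long sample_size,
                                      std::mt19937& rng)
{
    std::vector<long> pool(sample_size);  // sample_size zeros (the bug)
    for (long i = min; i != max; ++i)
        pool.push_back(i);
    std::shuffle(pool.begin(), pool.end(), rng);
    return std::vector<long>(pool.begin(), pool.begin() + sample_size);
}

// Corrected variant: the pool holds exactly [min, max), so every sampled
// index is distinct. Assumes 0 <= sample_size <= max - min.
std::vector<long> sample_distinct(long min, long max, long sample_size,
                                  std::mt19937& rng)
{
    std::vector<long> pool(max - min);
    std::iota(pool.begin(), pool.end(), min);  // min, min + 1, ..., max - 1
    std::shuffle(pool.begin(), pool.end(), rng);
    pool.resize(sample_size);
    return pool;
}

// Number of distinct values in a sample.
std::size_t count_distinct(const std::vector<long>& sample)
{
    return std::set<long>(sample.begin(), sample.end()).size();
}
```

With the corrected sampler, count_distinct equals the sample size, so each (i, i) position is inserted at most once. With the padded version the count is lower, and the repeated zeros are what trip the insert call.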
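Replace the insert calls with coeffRef(i,j) += value, as recommended in the documentation, or use a triplet list. After this small fix, the C++ code takes 0.012s and the Python code takes 0.021s on my computer. Note that since the input matrices are not exactly the same, you cannot really infer from these two numbers which one is faster, but at least they are of the same order of magnitude.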