Posts by Kim*_*ter

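Comparing execution times of algorithms: why does the order of execution matter?

Whenever I compare the execution times of two competing algorithms (in C++), I use std::chrono, as suggested earlier in this question: Measuring execution time of a function in C++

However, I consistently notice that the order in which the compared algorithms are executed significantly affects the measured times. It often even changes which of the competing algorithms appears to be the fastest. For example, suppose I have two algorithms, algo1 and algo2.

What I mean is that the following code: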

// Time points for algo1 and algo2 (names match their uses below)
std::chrono::high_resolution_clock::time_point start1, start2;
std::chrono::high_resolution_clock::time_point end1, end2;

start1 = std::chrono::high_resolution_clock::now();
algo1();
end1 = std::chrono::high_resolution_clock::now();

start2 = std::chrono::high_resolution_clock::now();
algo2();
end2 = std::chrono::high_resolution_clock::now();

auto time_elapsed1 = std::chrono::duration_cast<std::chrono::nanoseconds>(end1 - start1).count();
auto time_elapsed2 = std::chrono::duration_cast<std::chrono::nanoseconds>(end2 - start2).count();


gives different results from the following code:

// Same declarations as before; only the execution order below differs
std::chrono::high_resolution_clock::time_point start1, start2;
std::chrono::high_resolution_clock::time_point end1, end2;

start2 = std::chrono::high_resolution_clock::now();
algo2();
end2 = std::chrono::high_resolution_clock::now();

start1 = std::chrono::high_resolution_clock::now();
algo1();
end1 = std::chrono::high_resolution_clock::now();

auto time_elapsed1 = std::chrono::duration_cast<std::chrono::nanoseconds>(end1 - start1).count();
auto time_elapsed2 = std::chrono::duration_cast<std::chrono::nanoseconds>(end2 - start2).count();


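This happens for almost any pair of algorithms 1 and 2 that I might want to compare.

So my question is twofold: 1) Why does this happen, i.e. why does the order matter? 2) Is there a better way to compare the execution times of two algorithms, i.e. how can one make better and more accurate comparisons?

PS: Of course, I always test with all compiler optimizations enabled.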
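
For readers facing the same problem, here is a minimal sketch of one common way to reduce the order effect. It is not part of the original question. It assumes the effect comes mostly from one-time costs such as cold caches, first-touch page faults, or CPU frequency ramp-up, which is plausible but not confirmed here. The idea is to run both algorithms a few times untimed, then time them over many rounds while alternating which one goes first, and compare medians rather than single samples. The names `time_once`, `median`, `compare_interleaved`, and the bodies of `algo1` and `algo2` are hypothetical placeholders, and `std::chrono::steady_clock` is used here in place of `high_resolution_clock` because it is guaranteed to be monotonic.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical stand-ins for the two algorithms under test.
static volatile long long sink = 0;  // written so the work is not discarded entirely
void algo1() { long long s = 0; for (int i = 0; i < 100000; ++i) s += i;       sink = s; }
void algo2() { long long s = 0; for (int i = 0; i < 100000; ++i) s += 2LL * i; sink = s; }

// Time a single call of f, in nanoseconds, with a monotonic clock.
template <class F>
long long time_once(F f) {
    auto start = std::chrono::steady_clock::now();
    f();
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();
}

// Median of the samples (upper median when the count is even).
long long median(std::vector<long long> v) {
    assert(!v.empty());
    std::nth_element(v.begin(), v.begin() + v.size() / 2, v.end());
    return v[v.size() / 2];
}

struct Comparison {
    long long median_ns_1;
    long long median_ns_2;
};

// Warm up both functions, then time them over `rounds` rounds,
// alternating which one runs first in each round.
template <class F1, class F2>
Comparison compare_interleaved(F1 f1, F2 f2, int warmup, int rounds) {
    for (int i = 0; i < warmup; ++i) { f1(); f2(); }  // untimed warm-up runs
    std::vector<long long> t1, t2;
    for (int r = 0; r < rounds; ++r) {
        if (r % 2 == 0) {
            t1.push_back(time_once(f1));
            t2.push_back(time_once(f2));
        } else {
            t2.push_back(time_once(f2));
            t1.push_back(time_once(f1));
        }
    }
    return {median(t1), median(t2)};
}
```

Calling `compare_interleaved(algo1, algo2, 5, 51)` returns the two medians in nanoseconds. If the ranking still flips between runs of the whole program, the difference between the two algorithms is probably within measurement noise for this input.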

c++ algorithm profiling performance-testing

Kim*_*ter

2017-05-23

5
votes

1
answer

196
views

The correct way to sum two arrays with SSE2 SIMD in C++

Let's start by including the following:

#include <ctime>   // time(), used to seed the random engine below
#include <vector>
#include <random>
using namespace std;


Now, suppose one has the following three std::vector<float>:

const int N = 1048576;  // 2^20 elements
vector<float> a(N);
vector<float> b(N);
vector<float> c(N);

// Fill a and b with uniform random values in [0, 1)
default_random_engine randomGenerator(time(0));
uniform_real_distribution<float> diceroll(0.0f, 1.0f);
for(int i=0; i<N; i++)
{
    a[i] = diceroll(randomGenerator);
    b[i] = diceroll(randomGenerator);
}


Now, suppose one needs to add a and b element-wise and store the result in c. In scalar form this looks like:

for(int i=0; i<N; i++)
{
    c[i] = a[i] + b[i];
}


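What would the SSE2-vectorized version of the code above be, keeping in mind that the inputs are a and b as defined above (i.e. collections of float) and the output is c (also a collection of float)?

After quite a bit of research, I came up with the following: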

for(int i=0; i<N; i+=4)
{
    float a_toload[4] = …
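
The author's snippet is truncated in the source, so what it did after this point is unknown. Independently of it, here is a minimal sketch of one common way to write the element-wise sum with SSE intrinsics. It assumes an x86 target and uses `_mm_loadu_ps`, `_mm_add_ps`, and `_mm_storeu_ps`. Strictly speaking these single-precision intrinsics belong to SSE rather than SSE2, but they are available once `<emmintrin.h>` is included. The function name `add_sse` is hypothetical. Unaligned loads and stores are used because `std::vector<float>` does not guarantee 16-byte alignment, and a scalar loop handles any leftover elements when the size is not a multiple of 4.

```cpp
#include <cassert>
#include <cstddef>
#include <emmintrin.h>  // SSE2 header; also provides the SSE float intrinsics used below
#include <vector>

// Element-wise c = a + b, processing four floats per iteration.
// Hypothetical helper, not taken from the original post.
void add_sse(const std::vector<float>& a,
             const std::vector<float>& b,
             std::vector<float>& c) {
    assert(a.size() == b.size() && b.size() == c.size());
    const std::size_t n = a.size();
    std::size_t i = 0;

    // Vector part: load 4 floats from each input, add, store 4 results.
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(&a[i]);  // unaligned load
        __m128 vb = _mm_loadu_ps(&b[i]);
        _mm_storeu_ps(&c[i], _mm_add_ps(va, vb));
    }

    // Scalar tail for the remaining 0 to 3 elements.
    for (; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}
```

With the vectors from the question, the call would be `add_sse(a, b, c);`. Loading directly from the vector's storage avoids copying each group of four elements into a temporary array first.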


c++ arrays sse sum simd

Kim*_*ter

2016-09-29

2
votes

1
answer

6739
views

Tag statistics

c++ ×2

algorithm ×1

arrays ×1

performance-testing ×1

profiling ×1

simd ×1

sse ×1

sum ×1
