and*_*dge 5 c++ performance multithreading c++11
I'm working with threads in C++, specifically using them to parallelize a map operation.
Here's the code:
#include <thread>
#include <iostream>
#include <cstdlib>
#include <vector>
#include <math.h>
#include <stdio.h>
#include <time.h>   // clock(), clock_gettime(), struct timespec

double multByTwo(double x){
    return x*2;
}

double doJunk(double x){
    return cos(pow(sin(x*2),3));
}

// Apply ptr to each of the n elements of data, in place.
template <typename T>
void map(T* data, int n, T (*ptr)(T)){
    for (int i=0; i<n; i++)
        data[i] = (*ptr)(data[i]);
}

// Split data into NUMCORES contiguous chunks and map each chunk on its own thread.
template <typename T>
void parallelMap(T* data, int n, T (*ptr)(T)){
    const int NUMCORES = 3;
    const int chunk = n/NUMCORES;
    std::vector<std::thread> threads;
    for (int i=0; i<NUMCORES; i++) {
        // The last chunk also takes the n % NUMCORES leftover elements.
        int len = (i == NUMCORES-1) ? n - i*chunk : chunk;
        threads.push_back(std::thread(&map<T>, data + i*chunk, len, ptr));
    }
    for (std::thread& t : threads)
        t.join();
}

int main()
{
    int n = 1000000000;
    double* nums = new double[n];
    for (int i=0; i<n; i++)
        nums[i] = i;
    std::cout<<"go"<<std::endl;
    clock_t c1 = clock();   // CPU time used by the process
    struct timespec start, finish;
    double elapsed;
    clock_gettime(CLOCK_MONOTONIC, &start);   // wall-clock time
    // also try with &doJunk
    //parallelMap(nums, n, &multByTwo);
    map(nums, n, &doJunk);
    std::cout << nums[342] << std::endl;
    clock_gettime(CLOCK_MONOTONIC, &finish);
    printf("CPU elapsed time is %f seconds\n", double(clock()-c1)/CLOCKS_PER_SEC);
    elapsed = (finish.tv_sec - start.tv_sec);
    elapsed += (finish.tv_nsec - start.tv_nsec) / 1000000000.0;
    printf("Actual elapsed time is %f seconds\n", elapsed);
    delete[] nums;
}
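With multByTwo, the parallel version is actually slightly slower (1.01 seconds versus 0.95 seconds of real time), while with doJunk it is faster (51 seconds versus 136 seconds of real time). To me this implies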
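Just a guess: what you are probably seeing is that the multByTwo code is so fast that you have saturated memory bandwidth. No matter how much processor power you throw at it, the code won't run any faster, because it is already going as fast as the data can be fetched from RAM.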
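One rough way to sanity-check this guess is to turn the timing from the question into an effective memory throughput. The sketch below assumes each element costs one 8-byte read and one 8-byte write; `effectiveBandwidthGBps` and `timeDoublingPass` are hypothetical helper names, not part of the original code. With the numbers reported in the question (1,000,000,000 doubles in 0.95 seconds), this comes to roughly 16.8 GB/s. The asker's hardware is not stated, so whether that is the machine's actual memory limit remains an assumption.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: bytes read plus bytes written by one in-place pass
// over n doubles, divided by the elapsed time, in GB/s (1 GB = 1e9 bytes).
// Assumes exactly one read and one write per element.
double effectiveBandwidthGBps(std::size_t n, double seconds) {
    assert(seconds > 0.0);
    const double bytesMoved = 2.0 * static_cast<double>(n) * sizeof(double);
    return bytesMoved / seconds / 1e9;
}

// Hypothetical helper: run one in-place doubling pass (the same work as
// map with multByTwo) and return the elapsed wall-clock time in seconds.
double timeDoublingPass(std::vector<double>& v) {
    const auto start = std::chrono::steady_clock::now();
    for (double& x : v)
        x *= 2;
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

If the figure for multByTwo stays about the same as the thread count changes, while doJunk keeps scaling, that would be consistent with the memory-saturation explanation. It would not prove it.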