Does parallel_for (Intel TBB) have overhead similar to what we see with std::function?

Viv*_*nda 5 c++ tbb c++11

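The linked question "std::function vs template" has a good discussion of the overhead of std::function. In short, to avoid the roughly 10x overhead reported there, which comes from heap-allocating the functor passed to the std::function constructor, one must use std::ref or std::cref.

The example from @CassioNeri's answer shows how to pass a lambda to std::function by reference.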

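float foo(std::function<float(float)> f) { return -1.0f * f(3.3f) + 666.0f; }
// std::cref cannot bind to a temporary (its rvalue overload is deleted), so name the lambda first.
auto lambda = [a,b,c](float arg){ return arg * 0.5f; };
foo(std::cref(lambda));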
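
To make the heap-allocation claim concrete, here is a minimal sketch that counts calls to a replaced global `operator new` while `foo` is called both ways. The counter `g_allocations` and the helpers `allocations_by_value` and `allocations_by_cref` are hypothetical names introduced only for this illustration, and the lambda captures a deliberately large array so that it should not fit in any small-object buffer. The exact counts depend on the standard library implementation and the optimization level, so treat them as something to measure rather than a guarantee.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <new>

// Hypothetical counter: incremented on every call to the replaced operator new.
static std::size_t g_allocations = 0;

void* operator new(std::size_t size) {
    ++g_allocations;
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc{};
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

float foo(std::function<float(float)> f) { return -1.0f * f(3.3f) + 666.0f; }

// Passes a copy of the lambda: std::function has to store the whole closure,
// which is too large for a small-object buffer, so it typically goes on the heap.
std::size_t allocations_by_value() {
    std::array<float, 64> big{};
    big[0] = 0.5f;
    auto lambda = [big](float arg) { return arg * big[0]; };
    const std::size_t before = g_allocations;
    foo(lambda);
    return g_allocations - before;
}

// Passes a std::reference_wrapper: std::function only stores a small wrapper
// around a reference to the named lambda.
std::size_t allocations_by_cref() {
    std::array<float, 64> big{};
    big[0] = 0.5f;
    auto lambda = [big](float arg) { return arg * big[0]; };
    const std::size_t before = g_allocations;
    foo(std::cref(lambda));
    return g_allocations - before;
}
```

Calling `allocations_by_value()` and `allocations_by_cref()` from a test program and comparing the two numbers shows whether the by-value path allocates on a given toolchain. On the common implementations the `std::cref` path is expected to report zero.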


Now, the Intel Threading Building Blocks library lets you evaluate loops in parallel using a lambda/functor, as in the example below.

Example code:

#include "tbb/task_scheduler_init.h"
#include "tbb/blocked_range.h"
#include "tbb/parallel_for.h"
#include "tbb/tbb_thread.h"
#include <algorithm>  // std::fill
#include <vector>

int main() {
 tbb::task_scheduler_init init(tbb::tbb_thread::hardware_concurrency());
 std::vector<double> a(1000);
 std::vector<double> c(1000);
 std::vector<double> b(1000);

 std::fill(b.begin(), b.end(), 1);
 std::fill(c.begin(), c.end(), 1);

 auto f = [&](const tbb::blocked_range<size_t>& r) {
  for(size_t j=r.begin(); j!=r.end(); ++j) a[j] = b[j] + c[j];    
 };
 tbb::parallel_for(tbb::blocked_range<size_t>(0, 1000), f);
 return 0;
}
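
For readers without TBB installed, the shape of this call can be imitated serially. The sketch below is only an illustration: `simple_range`, `serial_for` and `add_vectors` are hypothetical stand-ins, not TBB APIs, and nothing here runs in parallel. What it shares with `tbb::parallel_for` is the form of the signature: a function template that deduces the `Body` type and takes the body by const reference, so the call itself does not go through `std::function`. It says nothing about what TBB does with the body internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for tbb::blocked_range<size_t>: a half-open range [begin, end).
class simple_range {
public:
    simple_range(std::size_t b, std::size_t e) : begin_(b), end_(e) {}
    std::size_t begin() const { return begin_; }
    std::size_t end() const { return end_; }
private:
    std::size_t begin_;
    std::size_t end_;
};

// Hypothetical serial stand-in for the parallel_for call shape.
// Body is a deduced template parameter, so no type erasure is involved here.
template <typename Range, typename Body>
void serial_for(const Range& range, const Body& body) {
    assert(range.begin() <= range.end());
    body(range);  // a real parallel algorithm would split the range across workers
}

// Same computation as the example above: a[j] = b[j] + c[j] with b and c filled with 1.
std::vector<double> add_vectors(std::size_t n) {
    std::vector<double> a(n), b(n, 1.0), c(n, 1.0);
    auto f = [&](const simple_range& r) {
        for (std::size_t j = r.begin(); j != r.end(); ++j) a[j] = b[j] + c[j];
    };
    serial_for(simple_range(0, n), f);
    return a;
}
```

Calling `add_vectors(1000)` returns a vector of 1000 elements, each equal to 2.0.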


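So my question is: does Intel TBB parallel_for have the same kind of overhead (heap allocation of the functor) that we see with std::function? Should I pass my functor/lambda to parallel_for by reference using std::cref to speed up the code?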
