std::bind vs lambda performance

Ada*_*rek 12 c++ lambda caching bind c++11

I wanted to time the execution of some functions, so I wrote myself a helper:

#include <chrono>
#include <functional>
#include <iostream>
#include <string>

using namespace std;

// Calls fun(args...) N times and prints the total elapsed time in milliseconds.
template<int N = 1, class Fun, class... Args>
void timeExec(string name, Fun fun, Args... args) {

    auto start = chrono::steady_clock::now();

    for(int i = 0; i < N; ++i) {
        fun(args...);
    }

    auto end = chrono::steady_clock::now();

    auto diff = end - start;
    cout << name << ": " << chrono::duration<double, milli>(diff).count() << " ms." << endl;
}

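I figured that to time member functions this way I have to use either bind or a lambda, and I wanted to see which one would affect performance, so I did: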

const int TIMES = 10000;
timeExec<TIMES>("Bind evaluation", bind(&decltype(result)::eval, &result));
timeExec<1>("Lambda evaluation", [&]() {
    for(int i = 0; i < TIMES; ++i) {
        result.eval();
    }
});

The results are:

Bind evaluation: 0.355158 ms.
Lambda evaluation: 0.014414 ms.

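I don't know the internals, but I assumed a lambda couldn't be better than bind. The only plausible explanation I can think of is that the compiler optimizes away the subsequent function evaluations in the lambda's loop.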

How would you explain it?

Pot*_*ter 15

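I assumed a lambda couldn't be better than bind.

That is quite a preconception.

Lambdas are tied into the compiler internals, so extra optimization opportunities may be found. Moreover, they are designed to avoid inefficiency.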

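However, there are probably no compiler optimization tricks at work here. The likely culprit is the argument to bind, bind(&decltype(result)::eval, &result). You are passing a pointer to member function (PTMF) and an object. Unlike a lambda's closure type, a PTMF doesn't capture which function actually gets called; its type only contains the function signature (the parameter and return types). The slow loop uses an indirect-branch function call, because the compiler failed to resolve the function pointer through constant propagation.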
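To illustrate the point about types, here is a minimal sketch. `Widget`, `callThroughPtmf`, `callThroughClosure` and the two `demo` functions are hypothetical names made up for this example; none of them come from the question. It shows that two different member functions share a single PTMF type, while every lambda gets its own closure type, so the call target is part of the type in one case and only a runtime value in the other.

```cpp
#include <type_traits>

// Hypothetical type, for illustration only.
struct Widget {
    int a() { return 1; }
    int b() { return 2; }
};

// Both members have the type int (Widget::*)(), so the PTMF type alone
// cannot tell the compiler which function will be called.
static_assert(std::is_same<decltype(&Widget::a), decltype(&Widget::b)>::value,
              "a and b share one pointer-to-member-function type");

// The target is only known from the runtime value of pmf. Unless the
// compiler can propagate the constant, this is an indirect call.
int callThroughPtmf(Widget& w, int (Widget::*pmf)()) {
    return (w.*pmf)();
}

// F is a distinct closure type for each lambda, so the target of f(w)
// is fixed by the template argument itself.
template <class F>
int callThroughClosure(Widget& w, F f) {
    return f(w);
}

int demoPtmf() {
    Widget w;
    return callThroughPtmf(w, &Widget::a) * 10 + callThroughPtmf(w, &Widget::b);
}

int demoClosure() {
    Widget w;
    auto la = [](Widget& x) { return x.a(); };
    auto lb = [](Widget& x) { return x.b(); };
    static_assert(!std::is_same<decltype(la), decltype(lb)>::value,
                  "each lambda has its own closure type");
    return callThroughClosure(w, la) * 10 + callThroughClosure(w, lb);
}
```

Both `demoPtmf()` and `demoClosure()` compute the same value; the difference lies only in how much the compiler knows about the call target at compile time. Whether a given compiler actually emits an indirect call for the PTMF version depends on its inlining and constant propagation, so this is an illustration of the mechanism, not a measurement.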

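If you rename the member eval() to operator () () and get rid of bind, then the explicit object will essentially behave like the lambda, and the performance difference should disappear.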
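A rough sketch of that suggestion follows. `Result` is a hypothetical stand-in for the type of `result`, which the question does not show, and `demoFunctor` is a made-up driver. The helper is the one from the question with `std::` prefixes. Using `std::ref` is an assumption of this sketch, not part of the suggestion: since `timeExec` takes its callable by value, `std::ref` keeps the calls on the original object instead of a copy.

```cpp
#include <chrono>
#include <functional>
#include <iostream>
#include <string>

// The question's helper: calls fun(args...) N times and prints the elapsed time.
template <int N = 1, class Fun, class... Args>
void timeExec(const std::string& name, Fun fun, Args... args) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < N; ++i) {
        fun(args...);
    }
    auto end = std::chrono::steady_clock::now();
    std::cout << name << ": "
              << std::chrono::duration<double, std::milli>(end - start).count()
              << " ms." << std::endl;
}

// Hypothetical stand-in for the type of `result`; eval() has been renamed
// to operator()() so the object itself is callable.
struct Result {
    long calls = 0;
    void operator()() { ++calls; }
};

long demoFunctor() {
    Result result;
    // No bind and no pointer to member: the callable's static type
    // (std::reference_wrapper<Result>) determines which function is called.
    timeExec<10000>("Functor evaluation", std::ref(result));
    return result.calls;  // number of calls made on the original object
}
```

Calling `demoFunctor()` prints one timing line and returns the number of calls that reached `result`. The timings themselves will vary by compiler and flags, so none are shown here.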

  • Has anyone benchmarked this to verify it? (3 upvotes)

小智 8

I tested it. My results show that the lambda is indeed faster than bind.

Here is the code (please don't mind the style):

#include <iostream>
#include <functional>
#include <chrono>
#include <cstdint>   // uint8_t, uint16_t, uint32_t
#include <cstdlib>   // system

using namespace std;
using namespace chrono;
using namespace placeholders;

typedef void SumDataBlockEventHandler(uint8_t data[], uint16_t len);

class SpeedTest {
    uint32_t sum = 0;
    uint8_t i = 0;
    void SumDataBlock(uint8_t data[], uint16_t len) {
        for (i = 0; i < len; i++) {
            sum += data[i];
        }
    }
public:
    function<SumDataBlockEventHandler> Bind() {
        return bind(&SpeedTest::SumDataBlock, this, _1, _2);
    }
    function<SumDataBlockEventHandler> Lambda() {
        return [this](auto data, auto len)
        {
            SumDataBlock(data, len);
        };
    }
};

int main()
{
    SpeedTest test;
    function<SumDataBlockEventHandler> testF;
    uint8_t data[] = { 0,1,2,3,4,5,6,7 };

#if _DEBUG
    const uint32_t testFcallCount = 1000000;
#else
    const uint32_t testFcallCount = 100000000;
#endif
    uint32_t callsCount, whileCount = 0;
    auto begin = high_resolution_clock::now();
    auto end = begin;

    while (whileCount++ < 10) {
        testF = test.Bind();
        begin = high_resolution_clock::now();
        callsCount = 0;
        while (callsCount++ < testFcallCount)
            testF(data, 8);
        end = high_resolution_clock::now();
        cout << testFcallCount << " calls of binded function: " << duration_cast<nanoseconds>(end - begin).count() << "ns" << endl;

        testF = test.Lambda();
        begin = high_resolution_clock::now();
        callsCount = 0;
        while (callsCount++ < testFcallCount)
            testF(data, 8);
        end = high_resolution_clock::now();
        cout << testFcallCount << " calls of lambda function: " << duration_cast<nanoseconds>(end - begin).count() << "ns" << endl << endl;
    }
    system("pause");
}

Console output (release build with full optimization):

100000000 calls of binded function: 1846298524ns
100000000 calls of lambda function: 1048086461ns

100000000 calls of binded function: 1259759880ns
100000000 calls of lambda function: 1032256243ns

100000000 calls of binded function: 1264817832ns
100000000 calls of lambda function: 1039052353ns

100000000 calls of binded function: 1263404007ns
100000000 calls of lambda function: 1031216018ns

100000000 calls of binded function: 1275305794ns
100000000 calls of lambda function: 1041313446ns

100000000 calls of binded function: 1256565304ns
100000000 calls of lambda function: 1031961675ns

100000000 calls of binded function: 1248132135ns
100000000 calls of lambda function: 1033890224ns

100000000 calls of binded function: 1252277130ns
100000000 calls of lambda function: 1042336736ns

100000000 calls of binded function: 1250320869ns
100000000 calls of lambda function: 1046529458ns

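I compiled it with Visual Studio Enterprise 2015 in Release mode with full optimization (/Ox), and in Debug mode with optimizations disabled. The results confirm that the lambda is faster than bind on my laptop (Dell Inspiron 7537, Intel Core i7-4510U 2.00GHz, 8GB RAM).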

Could someone verify this on their own machine?

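  • On an Intel(R) Core(TM) i7-6820HQ CPU @ 2.70GHz - 32GB RAM - clang++-4.0 libstdc++, the lambda takes 1/3 of the time of bind. Compiled with -O3, bind and lambda are nearly equal, but bind is still 5% slower. Compiling without loop unrolling confirmed the results. (3 upvotes)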