Why does writing a default constructor as "{}" instead of "= default" change performance?

Question

Ell*_*ith 1 c++ performance constructor default-constructor compiler-optimization

I recently noticed a performance hit that seemed to come from declaring a default constructor like this:

Foo() = default;

instead of

Foo() {}

(FYI, I need to declare it explicitly because I also have a variadic constructor, which would otherwise override the default constructor.)
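To make the situation concrete, here is a minimal sketch of a class with both a defaulted default constructor and a variadic constructor template. The names `Widget` and `which` are hypothetical and only serve the illustration; the actual class is not shown in the question. Overload resolution prefers the non-template constructor when no arguments are given.

```cpp
// Hypothetical illustration: Widget and which are not from the original code.
struct Widget
{
    int which = 0;  // records which constructor ran

    // Non-template default constructor: leaves which at its initializer, 0.
    Widget() = default;

    // Variadic constructor template: accepts any argument list.
    template <typename... Args>
    explicit Widget(Args&&...) : which(1) {}
};
```

With this sketch, `Widget a;` selects the defaulted constructor (`a.which == 0`), while `Widget b(1, 2.0);` selects the variadic template (`b.which == 1`).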

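This seemed strange to me, because I thought the two lines were equivalent (well, as long as a default constructor is possible. If it is not, the second line produces an error, while the first implicitly deletes the default constructor. Not my situation!).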
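The parenthetical case can be sketched as follows. `NoDefault` and `Defaulted` are hypothetical names chosen for this illustration: a member without a default constructor makes the `= default` version compile but be defined as deleted, whereas the `{}` version (left commented out) would be rejected at the point of definition.

```cpp
#include <type_traits>

// Hypothetical types for illustration only.
struct NoDefault
{
    explicit NoDefault(int) {}  // no default constructor available
};

struct Defaulted
{
    NoDefault m;
    Defaulted() = default;  // accepted, but implicitly defined as deleted
};

// The defaulted constructor exists only as a deleted function.
static_assert(!std::is_default_constructible<Defaulted>::value,
              "Defaulted() is deleted because m cannot be default-constructed");

// By contrast, this would be a hard error, since m has no initializer:
// struct UserProvided { NoDefault m; UserProvided() {} };
```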

So I wrote a small tester. The results vary a lot depending on the compiler, but under some settings I consistently see one being faster than the other:

#include <chrono>

template <typename T>
double TimeDefaultConstructor (int n_iterations)
{
    auto start_time = std::chrono::system_clock::now();

    for (int i = 0; i < n_iterations; ++i)
        T t;

    auto end_time = std::chrono::system_clock::now();

    std::chrono::duration<double> elapsed_seconds = end_time - start_time;

    return elapsed_seconds.count();
}

template <typename T, typename S>
double CompareDefaultConstructors (int n_comparisons, int n_iterations)
{
    int n_comparisons_with_T_faster = 0;

    for (int i = 0; i < n_comparisons; ++i)
    {
        double time_for_T = TimeDefaultConstructor<T>(n_iterations);
        double time_for_S = TimeDefaultConstructor<S>(n_iterations);

        if (time_for_T < time_for_S)    
            ++n_comparisons_with_T_faster;  
    }

    return (double) n_comparisons_with_T_faster / n_comparisons;
}


#include <vector>

template <typename T>
struct Foo
{
    std::vector<T> data_;

    Foo() = default;
};

template <typename T>
struct Bar
{
    std::vector<T> data_;

    Bar() {};
};

#include <iostream>
#include <typeinfo>  // required for typeid

int main ()
{
    int n_comparisons = 10000;
    int n_iterations = 10000;

    typedef int T;

    double result = CompareDefaultConstructors<Foo<T>,Bar<T>> (n_comparisons, n_iterations);

    std::cout << "With " << n_comparisons << " comparisons of " << n_iterations
        << " iterations of the default constructor, Foo<" << typeid(T).name() << "> was faster than Bar<" << typeid(T).name() << "> "
        << result*100 << "% of the time" << std::endl;

    std::cout << "swapping orientation:" << std::endl;

    result = CompareDefaultConstructors<Bar<T>,Foo<T>> (n_comparisons, n_iterations);

    std::cout << "With " << n_comparisons << " comparisons of " << n_iterations
        << " iterations of the default constructor, Bar<" << typeid(T).name() << "> was faster than Foo<" << typeid(T).name() << "> "
        << result*100 << "% of the time" << std::endl;

    return 0;
}

With the above program, compiled with g++ -std=c++11, I consistently get output similar to:

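With 10000 comparisons of 10000 iterations of the default constructor, Foo was faster than Bar 4.69% of the time
swapping orientation:
With 10000 comparisons of 10000 iterations of the default constructor, Bar was faster than Foo 96.23% of the time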

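Changing the compiler settings seems to change the results, sometimes flipping them completely. But what I can't understand is why this matters at all.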

Answer 1

Evg*_*Evg 7

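This benchmark does not measure what it is supposed to measure. Replace Bar() {}; with Bar() = default;, making Foo and Bar identical, and you will get the same kind of result: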

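With 10000 comparisons of 10000 iterations of the default constructor, Foo was faster than Bar 69.89% of the time
swapping orientation:
With 10000 comparisons of 10000 iterations of the default constructor, Bar was faster than Foo 29.9% of the time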

This is a vivid demonstration that you are measuring not the constructors but something else.

With -O1 optimization enabled, the for loop containing T t; degenerates into¹:

        test    ebx, ebx
        jle     .L3
        mov     eax, 0
.L4:
        add     eax, 1
        cmp     ebx, eax
        jne     .L4
.L3:

for both Foo and Bar, that is, into a trivial for (int i = 0; i < n_iterations; ++i); loop.

With -O2 or -O3 enabled, the loop is optimized away completely.
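If the goal is to time construction with optimization on, the object has to be made observable to the optimizer. The following is a minimal sketch, not part of the original answer: `escape` and `time_default_construction` are hypothetical helpers, the empty inline-assembly barrier is a GCC/Clang extension, and std::chrono::steady_clock replaces system_clock because it is monotonic. Even with these changes, timings of such a tiny operation remain noisy.

```cpp
#include <chrono>
#include <vector>

// Assumption: GCC or Clang. The empty asm statement claims it may read or
// write the memory p points to, so the compiler has to actually create
// the object instead of eliding it.
inline void escape(void* p)
{
    asm volatile("" : : "g"(p) : "memory");
}

// Hypothetical variant of TimeDefaultConstructor from the question.
template <typename T>
double time_default_construction(int n_iterations)
{
    auto start = std::chrono::steady_clock::now();

    for (int i = 0; i < n_iterations; ++i)
    {
        T t;
        escape(&t);  // keep t alive as far as the optimizer is concerned
    }

    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(end - start).count();
}
```

A call such as `time_default_construction<std::vector<int>>(10000)` returns the elapsed time in seconds. Comparing a type against itself first is a useful sanity check of the harness.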

Without optimization (-O0), you get the following assembly:

        mov     DWORD PTR [rbp-4], 0
.L35:
        mov     eax, DWORD PTR [rbp-4]
        cmp     eax, DWORD PTR [rbp-68]
        jge     .L34
        lea     rax, [rbp-64]
        mov     rdi, rax
        call    Foo<int>::Foo()
        lea     rax, [rbp-64]
        mov     rdi, rax
        call    Foo<int>::~Foo()
        add     DWORD PTR [rbp-4], 1
        jmp     .L35
.L34:

and the same for Bar, with Foo replaced by Bar.

Now let's take a look at the constructors:

Foo<int>::Foo()
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16
        mov     QWORD PTR [rbp-8], rdi
        mov     rax, QWORD PTR [rbp-8]
        mov     rdi, rax
        call    std::vector<int, std::allocator<int> >::vector()
        nop
        leave
        ret

and

Bar<int>::Bar()
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16
        mov     QWORD PTR [rbp-8], rdi
        mov     rax, QWORD PTR [rbp-8]
        mov     rdi, rax
        call    std::vector<int, std::allocator<int> >::vector()
        nop
        leave
        ret

As you can see, these are identical, too.

¹ GCC 8.3

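@Elliott-ReinstateMonica, they were identical before and they are identical after. Even identical assembly code can show different timings on a modern CPU. (2 upvotes)
@Elliott-ReinstateMonica, "https://godbolt.org" will be your good friend. (2 upvotes)
@Elliott: FYI: they are not identical. A `= default` constructor may be trivial (depending on the member subobjects), whereas a `{}` constructor *never* is. (2 upvotes)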
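The triviality point from the last comment can be checked with type traits. This is a small sketch with hypothetical struct names. Note that for the Foo in the question the defaulted constructor is not trivial either, because std::vector has a non-trivial default constructor, which is consistent with the identical assembly shown in the answer.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical types for illustration only.
struct DefaultedInt   { int x;              DefaultedInt() = default; };
struct UserProvided   { int x;              UserProvided() {} };
struct DefaultedVec   { std::vector<int> v; DefaultedVec() = default; };

// = default with only trivially constructible members: trivial.
static_assert(std::is_trivially_default_constructible<DefaultedInt>::value,
              "defaulted constructor over an int is trivial");

// A user-provided body, even an empty one, is never trivial.
static_assert(!std::is_trivially_default_constructible<UserProvided>::value,
              "{} constructor is user-provided, hence non-trivial");

// = default over a std::vector member: not trivial, as in Foo.
static_assert(!std::is_trivially_default_constructible<DefaultedVec>::value,
              "std::vector member makes the defaulted constructor non-trivial");
```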

Archived:	5 years, 11 months ago
Views:	206
Last activity:	5 years, 11 months ago