ostream_iterator vs for each loop efficiency

Sai*_*rmo 10 c++ performance foreach iterator

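I saw this user's post yesterday and thought it was a pretty cool way to output a vector. So I typed up an example and asked myself: how does this compare to a for each loop?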

template <typename T>
void printVectorO(std::vector<T> &v)
{
    std::cout << "Ostream_iterator contents: " << std::endl;
    auto start = std::chrono::high_resolution_clock::now();
    std::ostream_iterator<T> ost(std::cout, " ");
    std::copy(begin(v), end(v), ost);
    std::cout << std::endl;

    auto end = std::chrono::high_resolution_clock::now();
    auto time = end - start;
    auto nano = std::chrono::duration_cast<std::chrono::nanoseconds>(time);
    std::cout << "Ostream_iterator computation took: " << nano.count() << " nano seconds"<< std::endl;
    std::cout << std::endl;
}

template <typename T>
void printVectorC(std::vector<T> &v)
{
    std::cout << "For Each Loop contents: " << std::endl;
    auto start = std::chrono::high_resolution_clock::now();
    for (auto && e : v) std::cout << e << " ";
    std::cout << std::endl;

    auto end = std::chrono::high_resolution_clock::now();
    auto time = end - start;
    auto nano = std::chrono::duration_cast<std::chrono::nanoseconds>(time);
    std::cout << "For Each Loop took: " << nano.count() << " nano seconds" << std::endl;
    std::cout << std::endl;
}

I used 3 vectors to test this:

std::vector<double> doubles = { 3.15, 2.17, 2.555, 2.014 };
std::vector<std::string> strings = { "Hi", "how", "are", "you" };
std::vector<int> ints = { 3, 2 , 2 , 2 };

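I got varied results. When I output the doubles, the for each loop always beats ostream_iterator (e.g. 41856 vs 11207 and 55198 vs 10308 nanoseconds). With the strings, ostream_iterator sometimes beats the for each loop, and with the ints the for each loop and ostream_iterator stay almost neck and neck.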

Why is this? What is happening behind the scenes in ostream_iterator? When would I use ostream_iterator over a for each loop when it comes to efficiency and speed?

Mic*_*ler 3

Beware of microbenchmarks.

I have a few general comments on this code:

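  1. Pass read-only variables by const reference rather than by plain reference. This does not affect performance, though.
  2. Don't use std::endl, because it calls flush(), which ends up taking most of the run time in a microbenchmark like this. For example, printing the doubles with std::endl takes 37010 ns, while with '\n' it takes only 4456 ns.
  3. A single measurement is inaccurate. To filter out measurement noise, you have to run it many times in a loop. Even this is not perfect: the best approach is to alternate the tests, so that random events which slow the code down affect both implementations in the same way.
  4. It is better to redirect the output to a file, otherwise the speed of the terminal will dominate the results.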
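To illustrate point 2, here is a minimal sketch showing that std::endl writes the same characters as '\n' followed by std::flush, so the only thing '\n' alone leaves out is the flush. The function names are made up for this illustration, and it writes to a std::ostringstream rather than std::cout so the result can be inspected.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Returns the characters written when a line is terminated with std::endl.
std::string lineWithEndl()
{
    std::ostringstream out;
    out << 3.15 << std::endl;            // writes '\n', then calls out.flush()
    return out.str();
}

// Returns the characters written when a line is terminated with '\n'
// followed by an explicit flush.
std::string lineWithNewlineAndFlush()
{
    std::ostringstream out;
    out << 3.15 << '\n' << std::flush;
    return out.str();
}

// Both spellings produce identical text; using '\n' alone just skips the flush.
void checkEndlEquivalence()
{
    assert(lineWithEndl() == lineWithNewlineAndFlush());
}
```

On a string stream the flush is cheap, so this only demonstrates the equivalence. The cost shows up on std::cout, where each flush forces the buffered output out to the terminal or file.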

Here is the corrected benchmark:

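#include <algorithm>   // std::copy
#include <chrono>
#include <iostream>
#include <iterator>    // std::ostream_iterator
#include <string>
#include <vector>

// Number of times each print is repeated; the reported time is the average.
constexpr unsigned ITERATIONS = 1000000;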
template <typename T>
void printVectorO(const std::vector<T> &v)
{
    std::cout << "Ostream_iterator contents\n";
    auto start = std::chrono::high_resolution_clock::now();
    for (unsigned i=0 ; i < ITERATIONS; ++i) {
        std::ostream_iterator<T> ost(std::cout, " ");
        std::copy(begin(v), end(v), ost);
        std::cout << '\n';
    }

    auto end = std::chrono::high_resolution_clock::now();
    auto time = end - start;
    auto nano = std::chrono::duration_cast<std::chrono::nanoseconds>(time);
    std::cout << "Ostream_iterator computation took: "
              << nano.count() / ITERATIONS << " nano seconds\n\n";
}

template <typename T>
void printVectorC(const std::vector<T> &v)
{
    std::cout << "For Each Loop contents\n";
    auto start = std::chrono::high_resolution_clock::now();
    for (unsigned i=0 ; i < ITERATIONS ; ++i) {
        for (auto && e : v) std::cout << e << " ";
        std::cout << '\n';
    }

    auto end = std::chrono::high_resolution_clock::now();
    auto time = end - start;
    auto nano = std::chrono::duration_cast<std::chrono::nanoseconds>(time);
    std::cout << "For Each Loop took: "
              << nano.count() / ITERATIONS << " nano seconds\n\n";
}

And call it with:

template <class Container>
void test(const Container & ctr)
{
    printVectorC(ctr);   // for each loop version
    printVectorO(ctr);   // ostream_iterator version
}


int main()
{
    std::vector<double> doubles = { 3.15, 2.17, 2.555, 2.014 };
    test(doubles);
    std::vector<std::string> strings = { "Hi", "how", "are", "you" };
    test(strings);
    std::vector<int> ints = { 3, 2 , 2 , 2 };
    test(ints);
}

Now, after grepping the output for "nano", we have:

For Each Loop took: 2045 nano seconds
Ostream_iterator computation took: 2033 nano seconds
For Each Loop took: 487 nano seconds
Ostream_iterator computation took: 485 nano seconds
For Each Loop took: 503 nano seconds
Ostream_iterator computation took: 499 nano seconds

There is hardly any difference. In fact, in this particular run the ostream version looks faster, but running it again gives slightly different results.