How to understand a tricky speedup

Question

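Sorry, this may be too abstract a question, but it is a very practical one for me, and perhaps some experts have had a similar experience and can explain it.

I have a large code base, about 10,000 lines.

I noticed that if, at a certain place, I put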

```
if ( expression ) continue;
```

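where expression is always false (double-checked with the code logic and with cout) but depends on unknown parameters (so the compiler cannot simply eliminate this line at compile time), the program runs 25% faster (the results of the calculation are the same). If I measure the speed of the loop itself, the speedup factor is greater than 3.

Why does this happen, and is there a way to get this speedup without such tricks?

P.S. I use gcc 4.7.3 with -O3 optimization.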
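To make the setup concrete, here is a minimal sketch of a guard that is false on every iteration but still depends on a run-time value. The names `sum_plain`, `sum_guarded`, and `sentinel` are hypothetical and not taken from the original 10,000-line program. The sketch only shows that the guard leaves the result unchanged; whether it changes the timing depends on the compiler and the surrounding code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the real loop body: sums the squares of the data.
long long sum_plain(const std::vector<int>& data) {
    long long total = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        total += static_cast<long long>(data[i]) * data[i];
    }
    return total;
}

// Same loop with an extra guard. When `sentinel` comes from program input,
// the compiler cannot prove the condition false and has to keep the branch.
long long sum_guarded(const std::vector<int>& data, int sentinel) {
    long long total = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (data[i] == sentinel) continue;  // skipped only if the value occurs
        total += static_cast<long long>(data[i]) * data[i];
    }
    return total;
}

// Sanity check: a guard that never fires must not change the result.
void check_guard_is_neutral() {
    std::vector<int> data;
    for (int v = 0; v < 1000; ++v) data.push_back(v);
    const int sentinel = -1;  // assumed absent from `data`
    assert(sum_plain(data) == sum_guarded(data, sentinel));
}
```

Comparing the assembly that the compiler emits for the two functions is one way to see what the extra branch changes in the generated code.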

More information:

I tried two different expressions.
If I change the line to:
```
if ( expression ) { cout << " HELLO " << endl; continue; };
```
the speedup disappears.
If I change the line to:
```
expression;
```
the speedup disappears.

The code surrounding the line looks like this:

```
for ( int i = a; ;  ) {
  do {
    i += d;
    if ( d*i > d*ilast ) break;

    // small amount of calculations, and conditional calls of continue;

  } while ( expression0 );
  if ( d*i > dir*ilast ) break;

  if ( expression ) continue;

  // very big amount calculations, and conditional calls of continue;

}
```

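The for loop looks strange. That is because I modified the loops in order to catch this bottleneck. Initially expression was equal to expression0, and instead of the do-loop I had only this continue.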

I tried using __builtin_expect to understand the branch prediction. With

```
// the expression (= false) is supposed to be true by branch prediction.
if ( __builtin_expect( !!(expression), 1) ) continue;
```

the speedup is 25%. With

```
// the expression (= false) is supposed to be false by branch prediction.
if ( __builtin_expect( !!(expression), 0) ) continue;
```

the speedup disappears.
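For readers who have not used `__builtin_expect` before, the following is a minimal sketch of the usual wrapper macros. `LIKELY`, `UNLIKELY`, and both counting functions are names made up for this illustration. The builtin is a GCC and Clang extension that passes the compiler an expectation about the condition; it does not change what the program computes, which the check at the end confirms.

```cpp
#include <cassert>

// GCC and Clang provide __builtin_expect; other compilers get a plain fallback.
#if defined(__GNUC__) || defined(__clang__)
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define LIKELY(x)   (!!(x))
#define UNLIKELY(x) (!!(x))
#endif

// Hypothetical loop: counts the values in [1, n] that are not multiples of
// `step`. `step` must be nonzero. The `continue` branch is marked as rare.
int count_non_multiples_hinted(int n, int step) {
    int count = 0;
    for (int i = 1; i <= n; ++i) {
        if (UNLIKELY(i % step == 0)) continue;
        ++count;
    }
    return count;
}

// Same loop without any hint, used as the reference.
int count_non_multiples_plain(int n, int step) {
    int count = 0;
    for (int i = 1; i <= n; ++i) {
        if (i % step == 0) continue;
        ++count;
    }
    return count;
}

// The hint may affect code layout, but never the result.
void check_hint_is_neutral() {
    assert(count_non_multiples_hinted(100, 10) ==
           count_non_multiples_plain(100, 10));
}
```

The `!!(x)` idiom normalizes any truthy value to 1 so that it can be compared against the expected value 0 or 1.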

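If I use -O2 instead of -O3, the effect disappears. The code is then slightly (~3%) slower than the fast -O3 version with the false condition.
The same holds for "-O2 -finline-functions -funswitch-loops -fpredictive-commoning -fgcse-after-reload -ftree-vectorize". With one more option, "-O2 -finline-functions -funswitch-loops -fpredictive-commoning -fgcse-after-reload -ftree-vectorize -fipa-cp-clone", the effect is amplified: with "the line" the speed is the same, and without "the line" the code is 75% slower.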

The reason lies in the conditional statement that follows the line. So the code looks like this:

```
for ( int i = a; ;  ) {

  // small amount of calculations, and conditional calls of continue;

  if ( expression ) continue;

  // calculations1

  if ( expression2 ) {
    // calculations2
  }

  // very big amount calculations, and conditional calls of continue;

}
```

The value of expression2 is almost always false. So I changed it like this:

```
for ( int i = a; ;  ) {

  // small amount of calculations, and conditional calls of continue;

  // if ( expression ) continue; // don't need this anymore

  // calculations1

  if ( __builtin_expect( !!(expression2), 0 ) ) { // suppose expression2 == false
    // calculations2
  }

  // very big amount calculations, and conditional calls of continue;

}
```

and got the desired 25% speedup, even a little more. The behavior no longer depends on the critical line.

If somebody knows of material that can explain this behavior without guesswork, I will be very glad to read it and accept their answer.

Answer 1

klm*_*123 3

Found it.

The reason lies in the conditional statement that follows the line. So the code looks like this:

```
for ( int i = a; ;  ) {

  // small amount of calculations, and conditional calls of continue;

  if ( expression ) continue;

  // calculations1

  if ( expression2 ) {
    // calculations2
  }

  // very big amount calculations, and conditional calls of continue;

}
```

The value of expression2 is almost always false. So I changed it like this:

```
for ( int i = a; ;  ) {

  // small amount of calculations, and conditional calls of continue;

  // if ( expression ) continue; // don't need this anymore

  // calculations1

  if ( __builtin_expect( !!(expression2), 0 ) ) { // suppose expression2 == false
    // calculations2
  }

  // very big amount calculations, and conditional calls of continue;

}
```

and got the desired 25% speedup, even a little more. The behavior no longer depends on the critical line.
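A small harness like the following can be used to measure such a change instead of guessing. It is only a sketch under stated assumptions: `run_plain`, `run_hinted`, and `time_kernel` are hypothetical names, the loop bodies are toy stand-ins for calculations1 and calculations2, and `__builtin_expect` requires GCC or Clang. The timings it produces depend on the compiler version, the flags, and the machine, so no particular speedup should be expected from it.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Toy workload: the rare branch fires once every `period` iterations.
// `period` must be nonzero.
std::uint64_t run_plain(std::uint64_t n, std::uint64_t period) {
    std::uint64_t acc = 0;
    for (std::uint64_t i = 1; i <= n; ++i) {
        acc += i;                  // stands in for "calculations1"
        if (i % period == 0) {     // stands in for "expression2"
            acc ^= i;              // stands in for "calculations2"
        }
    }
    return acc;
}

// Same workload, with the rare branch marked as unlikely.
std::uint64_t run_hinted(std::uint64_t n, std::uint64_t period) {
    std::uint64_t acc = 0;
    for (std::uint64_t i = 1; i <= n; ++i) {
        acc += i;
        if (__builtin_expect(!!(i % period == 0), 0)) {
            acc ^= i;
        }
    }
    return acc;
}

typedef std::uint64_t (*Kernel)(std::uint64_t, std::uint64_t);

// Runs one kernel, stores its result in `result`, and returns the elapsed
// wall-clock time in seconds.
double time_kernel(Kernel kernel, std::uint64_t n, std::uint64_t period,
                   std::uint64_t& result) {
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    result = kernel(n, period);
    const std::chrono::steady_clock::time_point stop =
        std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Both variants must agree on the result before their timings are compared.
void check_variants_agree() {
    std::uint64_t plain = 0;
    std::uint64_t hinted = 0;
    time_kernel(run_plain, 100000, 1000, plain);
    time_kernel(run_hinted, 100000, 1000, hinted);
    assert(plain == hinted);
}
```

Repeating each measurement several times and comparing the fastest runs reduces noise from the rest of the system.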

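I don't know how to explain it, and I cannot find enough material about branch prediction.

But I suppose the point is that calculations2 should be skipped, yet the compiler does not know this and assumes expression2 == true by default. At the same time, it assumes that in a simple continue check

```
if ( expression ) continue;
```

expression == false, and it nicely skips calculations2, which is what should happen in any case. If the branch contains more complicated operations (for example, cout), it assumes that the expression is true and the trick does not work.

If somebody knows of material that can explain this behavior without guesswork, I will be very glad to read it and accept their answer.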
