gcc replaces loops with memcpy and memset

Question

I have the following simple program:

#define N 20
long c[N];
long a[N + N];

void f(void)
{
    long *s = c;
    long *p = a;
    while (p != a + N) *p++ = *s++;
    while (p != a + N + N) *p++ = 0;
}

I compile it with:

/usr/gcc-arm-none-eabi-5_4-2016q3/bin/arm-none-eabi-gcc -mthumb -O3 -o main.o -c main.c

gcc helpfully replaces the two loops with calls to memcpy and memset, respectively:

00000000 <f>:
   0:   b570            push    {r4, r5, r6, lr}
   2:   4d07            ldr     r5, [pc, #28]   ; (20 <f+0x20>)
   4:   4c07            ldr     r4, [pc, #28]   ; (24 <f+0x24>)
   6:   002a            movs    r2, r5
   8:   4907            ldr     r1, [pc, #28]   ; (28 <f+0x28>)
   a:   0020            movs    r0, r4
   c:   f7ff fffe       bl      0 <memcpy>
  10:   1960            adds    r0, r4, r5
  12:   002a            movs    r2, r5
  14:   2100            movs    r1, #0
  16:   f7ff fffe       bl      0 <memset>
  1a:   bc70            pop     {r4, r5, r6}
  1c:   bc01            pop     {r0}
  1e:   4700            bx      r0

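Read back as C, the listing above amounts to roughly the following. This is only a sketch for comparison: the names f_loops and f_libcalls are hypothetical, and the byte count is assumed to be N * sizeof(long), since the literal pool holding the actual value is not shown in the disassembly.

```c
#include <string.h>

#define N 20
long c[N];
long a[N + N];

/* The original function: two hand-written loops. */
void f_loops(void)
{
    long *s = c;
    long *p = a;
    while (p != a + N) *p++ = *s++;      /* copy c into the first half of a */
    while (p != a + N + N) *p++ = 0;     /* zero the second half of a */
}

/* Hypothetical hand-written equivalent of what gcc emitted at -O3. */
void f_libcalls(void)
{
    memcpy(a, c, N * sizeof(long));      /* bl memcpy: r0 = a, r1 = c, r2 = size */
    memset(a + N, 0, N * sizeof(long));  /* bl memset: r0 = a + size, r1 = 0, r2 = size */
}
```

Both functions leave a in the same state; the only difference is whether the work is done inline or by the C library.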

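Evidently gcc is being clever and assumes the library implementations are more efficient, which may or may not be true in any given case. I would like to know how to avoid this behavior when, for example, speed is not important and library calls are undesirable.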

Answer 1

A.K*_*.K. 8

OK, searching https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html turns up the following option:

-ftree-loop-distribute-patterns
Perform loop distribution of patterns that can be code generated with calls to a library. This flag is enabled by default at -O3.

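Specify -fno-tree-loop-distribute-patterns (for example, right after -O3 on the command line) to keep gcc from turning these loops into standard library calls without affecting the other optimizations.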
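If changing the flags for the whole translation unit is not desirable, the same option can, as far as I know, be applied to a single function through GCC's optimize function attribute. The following is a minimal sketch under that assumption: the macro name NO_LOOP_TO_LIBCALL is hypothetical, the attribute is GCC-specific, and the GCC manual describes the optimize attribute as intended for debugging rather than production code, so it is worth checking the disassembly to confirm that the calls are really gone.

```c
#define N 20
long c[N];
long a[N + N];

/* Assumption: GCC accepts the option name without its -f prefix here.
 * Other compilers (including clang) get an empty macro and may still
 * perform the transformation. NO_LOOP_TO_LIBCALL is a made-up name. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_LOOP_TO_LIBCALL \
    __attribute__((optimize("no-tree-loop-distribute-patterns")))
#else
#define NO_LOOP_TO_LIBCALL
#endif

/* Same function as in the question, with the pattern disabled locally. */
NO_LOOP_TO_LIBCALL
void f(void)
{
    long *s = c;
    long *p = a;
    while (p != a + N) *p++ = *s++;      /* copy loop, kept inline */
    while (p != a + N + N) *p++ = 0;     /* zeroing loop, kept inline */
}
```

The behavior of f is unchanged; only the code generation for this one function differs, while the rest of the file keeps the full -O3 settings.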

Answer 2

Mr.*_*bug 0

You are using the -O3 flag, which enables a more aggressive set of optimizations than the lower levels; try a lower level such as -O2 or -O.
