如何强制 g++ 内联 STL 函数

Min*_*Gao 0 c++ gcc stl compiler-optimization

测试代码:

#include <array>

int test(const std::array<int, 10> &arr) {
    return arr[9];
}
Run Code Online (Sandbox Code Playgroud)

我要实现arr[0]像C风格数组一样高效,这意味着内联STL数组[]运算符函数。

我检查了生成汇编代码:

$ g++ --std=c++17  -c test.cpp && objdump -d -C test.o

test.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <test(std::array<int, 10ul> const&)>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   48 83 ec 10             sub    $0x10,%rsp
   8:   48 89 7d f8             mov    %rdi,0xfffffffffffffff8(%rbp)
   c:   48 8b 45 f8             mov    0xfffffffffffffff8(%rbp),%rax
  10:   be 09 00 00 00          mov    $0x9,%esi
  15:   48 89 c7                mov    %rax,%rdi
  18:   e8 00 00 00 00          callq  1d <test(std::array<int, 10ul> const&)+0x1d>
  1d:   8b 00                   mov    (%rax),%eax
  1f:   c9                      leaveq
  20:   c3                      retq
Disassembly of section .text._ZNKSt5arrayIiLm10EEixEm:

0000000000000000 <std::array<int, 10ul>::operator[](unsigned long) const>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   48 83 ec 10             sub    $0x10,%rsp
   8:   48 89 7d f8             mov    %rdi,0xfffffffffffffff8(%rbp)
   c:   48 89 75 f0             mov    %rsi,0xfffffffffffffff0(%rbp)
  10:   48 8b 45 f8             mov    0xfffffffffffffff8(%rbp),%rax
  14:   48 8b 55 f0             mov    0xfffffffffffffff0(%rbp),%rdx
  18:   48 89 d6                mov    %rdx,%rsi
  1b:   48 89 c7                mov    %rax,%rdi
  1e:   e8 00 00 00 00          callq  23 <std::array<int, 10ul>::operator[](unsigned long) const+0x23>
  23:   c9                      leaveq
  24:   c3                      retq
Disassembly of section .text._ZNSt14__array_traitsIiLm10EE6_S_refERA10_Kim:

0000000000000000 <std::__array_traits<int, 10ul>::_S_ref(int const (&) [10], unsigned long)>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   48 89 7d f8             mov    %rdi,0xfffffffffffffff8(%rbp)
   8:   48 89 75 f0             mov    %rsi,0xfffffffffffffff0(%rbp)
   c:   48 8b 45 f0             mov    0xfffffffffffffff0(%rbp),%rax
  10:   48 8d 14 85 00 00 00    lea    0x0(,%rax,4),%rdx
  17:   00
  18:   48 8b 45 f8             mov    0xfffffffffffffff8(%rbp),%rax
  1c:   48 01 d0                add    %rdx,%rax
  1f:   5d                      pop    %rbp
  20:   c3                      retq
Run Code Online (Sandbox Code Playgroud)

arr[9]是生成代码中的函数调用1d <test(std::array<int, 10ul> const&)+0x1d>

如果我指定了优化级别,STL 函数将按预期内联:

$ g++ --std=c++17 -Og -c test.cpp && objdump -d -C test.o

test.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <test(std::array<int, 10ul> const&)>:
   0:   8b 47 24                mov    0x24(%rdi),%eax
   3:   c3                      retq
Run Code Online (Sandbox Code Playgroud)

但我的真实项目是一个大项目,我无法更改全局编译优化标志。所以我想为一些文件指定优化标志。

#pragma GCC optimize ("string"...)所以我在我的程序中添加:

#pragma GCC optimize ("-Og")

#include <array>

int test(const std::array<int, 10> &arr) {
    return arr[9];
}
Run Code Online (Sandbox Code Playgroud)

这个选项确实有一定道理:

$ g++ --std=c++17 -c test.cpp && objdump -d -C test.o

test.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <test(std::array<int, 10ul> const&)>:
   0:   48 83 ec 08             sub    $0x8,%rsp
   4:   be 09 00 00 00          mov    $0x9,%esi
   9:   e8 00 00 00 00          callq  e <test(std::array<int, 10ul> const&)+0xe>
   e:   8b 00                   mov    (%rax),%eax
  10:   48 83 c4 08             add    $0x8,%rsp
  14:   c3                      retq
Disassembly of section .text._ZNKSt5arrayIiLm10EEixEm:

0000000000000000 <std::array<int, 10ul>::operator[](unsigned long) const>:
   0:   48 83 ec 08             sub    $0x8,%rsp
   4:   e8 00 00 00 00          callq  9 <std::array<int, 10ul>::operator[](unsigned long) const+0x9>
   9:   48 83 c4 08             add    $0x8,%rsp
   d:   c3                      retq
Disassembly of section .text._ZNSt14__array_traitsIiLm10EE6_S_refERA10_Kim:

0000000000000000 <std::__array_traits<int, 10ul>::_S_ref(int const (&) [10], unsigned long)>:
   0:   48 8d 04 b7             lea    (%rdi,%rsi,4),%rax
   4:   c3                      retq
Run Code Online (Sandbox Code Playgroud)

<std::array<int, 10ul>::operator[]&函数__array_traits经过了 ard 优化,但我们可以看到,仍然有一个函数调用:callq e <test(std::array<int, 10ul> const&)+0xe>

所以我想知道为什么#pragma GCC optimize ("-Og")会像我预期的那样生效。我想知道如何为指定文件强制内联STL函数?


注:GCC 版本:8.2

rus*_*tyx 6

#pragma GCC optimize ("Og")默认情况下不启用内联,因为不优化时默认为-fno-inline.

用于#pragma GCC optimize ("Og,inline")启用内联。