我想检查boost::variant在我的代码中应用的程序集输出,以便查看哪些中间调用被优化掉了.
当我编译以下示例(使用GCC 5.3 g++ -O3 -std=c++14 -S)时,似乎编译器优化了所有内容并直接返回100:
(...)
main:
.LFB9320:
.cfi_startproc
movl $100, %eax
ret
.cfi_endproc
(...)
Run Code Online (Sandbox Code Playgroud)
#include <boost/variant.hpp>
struct Foo
{
int get() { return 100; }
};
struct Bar
{
int get() { return 999; }
};
using Variant = boost::variant<Foo, Bar>;
int run(Variant v)
{
return boost::apply_visitor([](auto& x){return x.get();}, v);
}
int main()
{
Foo f;
return run(f);
}
Run Code Online (Sandbox Code Playgroud)
但是,完整的程序集输出包含的内容远远超过上面的摘录,对我而言,它看起来永远不会被调用.有没有办法告诉GCC/clang删除所有"噪音"并输出程序运行时实际调用的内容?
完整装配输出:
.file "main1.cpp"
.section .rodata.str1.8,"aMS",@progbits,1
.align 8
.LC0:
.string "/opt/boost/include/boost/variant/detail/forced_return.hpp"
.section .rodata.str1.1,"aMS",@progbits,1
.LC1: …Run Code Online (Sandbox Code Playgroud) 使用GCC编译C程序的默认优化级别是-O0.根据GCC文档关闭所有优化.例如:
gcc -O0 test.c
Run Code Online (Sandbox Code Playgroud)
但是,要检查-O0是否真的关闭了所有优化.我执行了这个命令:
gcc -Q -O0 --help=optimizers
Run Code Online (Sandbox Code Playgroud)
在这里,我有点惊讶.我启用了大约50个选项.然后,我使用以下方法检查了传递给gcc的默认参数:
gcc -v
Run Code Online (Sandbox Code Playgroud)
我懂了:
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.8/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.8.4-
2ubuntu1~14.04' --with-bugurl=file:///usr/share/doc/gcc-4.8/README.Bugs --
enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --
program-suffix=-4.8 --enable-shared --enable-linker-build-id --
libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-
gxx-include-dir=/usr/include/c++/4.8 --libdir=/usr/lib --enable-nls --with-
sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-
time=yes --enable-gnu-unique-object --disable-libmudflap --enable-plugin --
with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-
cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64/jre --enable-
java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64 --with-
jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.8-amd64 --with-arch-
directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-
gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --
with-multilib-list=m32,m64,mx32 --with-tune=generic --enable-checking=release
--build=x86_64-linux-gnu …Run Code Online (Sandbox Code Playgroud) c optimization gcc performance-testing compiler-optimization
我在llvm clang Apple LLVM 8.0.0版(clang-800.0.42.1)上反汇编代码:
int main() {
float a=0.151234;
float b=0.2;
float c=a+b;
printf("%f", c);
}
Run Code Online (Sandbox Code Playgroud)
我编译时没有-O规范,但我也试过-O0(给出相同)和-O2(实际上计算值并存储它预先计算)
产生的反汇编如下(我删除了不相关的部分)
-> 0x100000f30 <+0>: pushq %rbp
0x100000f31 <+1>: movq %rsp, %rbp
0x100000f34 <+4>: subq $0x10, %rsp
0x100000f38 <+8>: leaq 0x6d(%rip), %rdi
0x100000f3f <+15>: movss 0x5d(%rip), %xmm0
0x100000f47 <+23>: movss 0x59(%rip), %xmm1
0x100000f4f <+31>: movss %xmm1, -0x4(%rbp)
0x100000f54 <+36>: movss %xmm0, -0x8(%rbp)
0x100000f59 <+41>: movss -0x4(%rbp), %xmm0
0x100000f5e <+46>: addss -0x8(%rbp), %xmm0
0x100000f63 <+51>: movss %xmm0, -0xc(%rbp)
...
Run Code Online (Sandbox Code Playgroud)
显然它正在做以下事情: