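I am trying to learn assembly by compiling Rust. I have found a way to compile Rust code to binary machine code and run objdump on it to view the assembly. However, if I write the following: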
#![no_main]
#[link_section = ".text.entry"]
#[no_mangle]
pub extern "C" fn _start() -> ! {
    let a: u64 = 4;
    let b: u64 = 7;
    let c: u64 = a * b;
    loop {}
}
The assembly I get is:
0000000000000000 <.data>:
0: 1101 addi sp,sp,-32
2: 4511 li a0,4
4: e42a sd a0,8(sp)
6: 451d li a0,7
8: e82a sd a0,16(sp)
a: 4571 li a0,28
c: ec2a sd a0,24(sp)
e: a009 j 0x10
10: a001 j 0x10
So it looks like Rust is folding the mul into a constant. I am using the following compile options:
Cargo.toml:
[profile.dev]
opt-level = 0
mir-opt-level = 0
Is there a way to stop Rust from optimizing this?
The emitted LLVM IR looks like this:
; Function Attrs: noreturn nounwind
define dso_local void @_start() unnamed_addr #0 section ".text.entry" !dbg !22 {
start:
  %c.dbg.spill = alloca i64, align 8
  %b.dbg.spill = alloca i64, align 8
  %a.dbg.spill = alloca i64, align 8
  store i64 4, i64* %a.dbg.spill, align 8, !dbg !36
  call void @llvm.dbg.declare(metadata i64* %a.dbg.spill, metadata !28, metadata !DIExpression()), !dbg !37
  store i64 7, i64* %b.dbg.spill, align 8, !dbg !38
  call void @llvm.dbg.declare(metadata i64* %b.dbg.spill, metadata !31, metadata !DIExpression()), !dbg !39
  store i64 28, i64* %c.dbg.spill, align 8, !dbg !40
  call void @llvm.dbg.declare(metadata i64* %c.dbg.spill, metadata !33, metadata !DIExpression()), !dbg !41
So it looks like the optimization happens before the LLVM passes.
build.rs:
fn main() {
    println!("cargo:rerun-if-changed=build.rs");
    println!("cargo:rustc-link-arg=-Tlink.ld");
}
link.ld:
ENTRY(_start)
SECTIONS {
    .text : { *(.text); *(.text.*) }
}
E_n*_*ate 10
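There is a compiler pass before LLVM IR generation: the generation of MIR (Rust's mid-level intermediate representation). If you emit it for the given code with a command such as: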
cargo rustc -- --emit mir
you will see in the generated .mir file that the optimization has already taken place.
fn _start() -> ! {
    let mut _0: !; // return place in scope 0 at src\main.rs:5:31: 5:32
    let _1: u64; // in scope 0 at src\main.rs:6:9: 6:10
    scope 1 {
        debug a => _1; // in scope 1 at src\main.rs:6:9: 6:10
        let _2: u64; // in scope 1 at src\main.rs:7:9: 7:10
        scope 2 {
            debug b => _2; // in scope 2 at src\main.rs:7:9: 7:10
            let _3: u64; // in scope 2 at src\main.rs:8:9: 8:10
            scope 3 {
                debug c => _3; // in scope 3 at src\main.rs:8:9: 8:10
            }
        }
    }

    bb0: {
        _1 = const 4_u64; // scope 0 at src\main.rs:6:18: 6:19
        _2 = const 7_u64; // scope 1 at src\main.rs:7:18: 7:19
        _3 = const 28_u64; // scope 2 at src\main.rs:8:18: 8:23
        goto -> bb1; // scope 3 at src\main.rs:10:5: 10:12
    }

    bb1: {
        goto -> bb1; // scope 3 at src\main.rs:10:5: 10:12
    }
}
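This happens because the mir-opt-level option currently exists only as an unstable compiler option (-Z flags require a nightly toolchain). It is not available as a profile property in Cargo. To set it manually, invoke the compiler directly: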
cargo rustc -- -Z mir-opt-level=0 --emit mir
With that, the optimization disappears:
fn _start() -> ! {
    let mut _0: !; // return place in scope 0 at src\main.rs:5:31: 5:32
    let mut _1: !; // in scope 0 at src\main.rs:5:33: 11:2
    let _2: u64; // in scope 0 at src\main.rs:6:9: 6:10
    let mut _5: u64; // in scope 0 at src\main.rs:8:18: 8:19
    let mut _6: u64; // in scope 0 at src\main.rs:8:22: 8:23
    let mut _7: (u64, bool); // in scope 0 at src\main.rs:8:18: 8:23
    let mut _8: !; // in scope 0 at src\main.rs:10:5: 10:12
    let mut _9: (); // in scope 0 at src\main.rs:5:1: 11:2
    scope 1 {
        debug a => _2; // in scope 1 at src\main.rs:6:9: 6:10
        let _3: u64; // in scope 1 at src\main.rs:7:9: 7:10
        scope 2 {
            debug b => _3; // in scope 2 at src\main.rs:7:9: 7:10
            let _4: u64; // in scope 2 at src\main.rs:8:9: 8:10
            scope 3 {
                debug c => _4; // in scope 3 at src\main.rs:8:9: 8:10
            }
        }
    }

    bb0: {
        StorageLive(_1); // scope 0 at src\main.rs:5:33: 11:2
        StorageLive(_2); // scope 0 at src\main.rs:6:9: 6:10
        _2 = const 4_u64; // scope 0 at src\main.rs:6:18: 6:19
        StorageLive(_3); // scope 1 at src\main.rs:7:9: 7:10
        _3 = const 7_u64; // scope 1 at src\main.rs:7:18: 7:19
        StorageLive(_4); // scope 2 at src\main.rs:8:9: 8:10
        StorageLive(_5); // scope 2 at src\main.rs:8:18: 8:19
        _5 = _2; // scope 2 at src\main.rs:8:18: 8:19
        StorageLive(_6); // scope 2 at src\main.rs:8:22: 8:23
        _6 = _3; // scope 2 at src\main.rs:8:22: 8:23
        _7 = CheckedMul(_5, _6); // scope 2 at src\main.rs:8:18: 8:23
        assert(!move (_7.1: bool), "attempt to compute `{} * {}`, which would overflow", move _5, move _6) -> bb1; // scope 2 at src\main.rs:8:18: 8:23
    }

    bb1: {
        _4 = move (_7.0: u64); // scope 2 at src\main.rs:8:18: 8:23
        StorageDead(_6); // scope 2 at src\main.rs:8:22: 8:23
        StorageDead(_5); // scope 2 at src\main.rs:8:22: 8:23
        StorageLive(_8); // scope 3 at src\main.rs:10:5: 10:12
        goto -> bb2; // scope 3 at src\main.rs:10:5: 10:12
    }

    bb2: {
        _9 = const (); // scope 3 at src\main.rs:10:10: 10:12
        goto -> bb2; // scope 3 at src\main.rs:10:5: 10:12
    }
}
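For readers unfamiliar with MIR, the CheckedMul and assert pair in the unoptimized output can be read roughly as the following Rust. This is only an illustrative sketch: the function name mul_like_debug_mir is hypothetical, and the panic message is shortened rather than copied from the compiler.

```rust
/// Hypothetical helper that mirrors, in surface Rust, what the
/// unoptimized MIR does for `a * b` when overflow checks are enabled.
pub fn mul_like_debug_mir(a: u64, b: u64) -> u64 {
    // Corresponds to `_7 = CheckedMul(_5, _6)`: a (u64, bool) pair holding
    // the wrapped product and an overflow flag.
    let (product, overflowed) = a.overflowing_mul(b);

    // Corresponds to `assert(!move (_7.1: bool), ...)`: panic if the
    // flag is set, otherwise continue to the next basic block.
    assert!(!overflowed, "attempt to multiply with overflow");

    // Corresponds to `_4 = move (_7.0: u64)`.
    product
}
```

Calling mul_like_debug_mir(4, 7) takes the non-overflowing path, just like bb0 -> bb1 in the MIR above.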
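This is probably as far as you can go without touching LLVM directly. Some optimizations in specific parts of the code can also be prevented with constructs such as black_box.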
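As a minimal sketch of the black_box approach, assuming a toolchain where core::hint::black_box is stable (Rust 1.66 or later), the operands from the question can be passed through black_box so the compiler no longer treats them as known constants. The function name multiply_opaque is made up for this example, and black_box is documented as best-effort only, so the multiply instruction is likely but not guaranteed to survive.

```rust
use core::hint::black_box;

/// Hypothetical variant of the question's arithmetic in which the
/// operands are hidden from the optimizer.
pub fn multiply_opaque() -> u64 {
    // black_box returns its argument unchanged, but asks the compiler to
    // assume the value could be anything, so 4 * 7 is not folded to 28.
    let a: u64 = black_box(4);
    let b: u64 = black_box(7);
    let c: u64 = a * b;

    // Passing the result through black_box as well discourages the
    // compiler from discarding the multiplication as unused.
    black_box(c)
}
```

The same three let bindings can be placed inside the _start function from the question, since core::hint is available without the standard library.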
See also: