Does Rust devirtualize trait object function calls?

Question

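devirtualize: to change a virtual/polymorphic/indirect function call into a static function call, because some guarantee makes the change correct. Source: myself

Given a simple trait object, &dyn ToString, created from the statically known type String: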

fn main() {
    let name: &dyn ToString = &String::from("Steve");
    println!("{}", name.to_string());
}

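Does the call to .to_string() use <String as ToString>::to_string() directly? Or does it only go indirectly through the trait's vtable? If it is indirect, would it be possible to devirtualize this call? Or is there something fundamental that prevents this optimization?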

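The code that motivated this question is much more complicated; it uses async trait functions, and I am wondering whether returning a Box<dyn Future> can be optimized in some cases.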

Answer 1

Mat*_* M. 9

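Does Rust devirtualize trait object function calls?

No.

Rust is a language; it does not "do" anything, it only prescribes semantics.

In this specific case, the Rust language neither requires nor forbids devirtualization, so an implementation is permitted to do it.

At the moment, the only stable implementation is rustc, with the LLVM backend, though you can use the Cranelift backend if you feel adventurous.

You can test your code against this implementation on the playground: select "Show LLVM IR" instead of "Run", and "Release" instead of "Debug", and you should be able to check that there is no virtual call.

A revised version of the code, which isolates the cast to a trait object plus the dynamic call in its own function to make them easier to find: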

#[inline(never)]
fn to_string(s: &String) -> String {
    let name: &dyn ToString = s;
    name.to_string()
}

fn main() {
    let name = String::from("Steve");
    let name = to_string(&name);
    println!("{}", name);
}

Compiled on the playground, this produces the following:

; playground::to_string
; Function Attrs: noinline nonlazybind uwtable
define internal fastcc void @_ZN10playground9to_string17h4a25abbd46fc29d4E(%"std::string::String"* noalias nocapture dereferenceable(24) %0, %"std::string::String"* noalias readonly align 8 dereferenceable(24) %s) unnamed_addr #0 {
start:
; call <alloc::string::String as core::clone::Clone>::clone
  tail call void @"_ZN60_$LT$alloc..string..String$u20$as$u20$core..clone..Clone$GT$5clone17h1e3037d7443348baE"(%"std::string::String"* noalias nocapture nonnull sret dereferenceable(24) %0, %"std::string::String"* noalias nonnull readonly align 8 dereferenceable(24) %s)
  ret void
}

Here you can clearly see that the call to ToString::to_string has been replaced by a plain call to <String as Clone>::clone; a devirtualized call.

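The code that motivated this question is much more complicated; it uses async trait functions, and I am wondering whether returning a Box<dyn Future> can be optimized in some cases.

Unfortunately, you cannot draw any conclusion from the example above.

Optimizations are finicky. In essence, most optimizations are akin to pattern-matching and replacing with regexes: differences that look harmless to a human may completely throw off the pattern-matching and prevent the optimization from applying.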
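To make the pattern-matching point concrete, here is a minimal sketch contrasting two cases. The function names `known_type` and `unknown_type` are made up for illustration. In the first, the concrete type is visible where the trait object is created, so the optimizer has a chance to devirtualize. In the second, only `&dyn ToString` reaches the function and inlining is disabled, so a vtable call is the likely outcome. Neither result is guaranteed by the language.

```rust
// The concrete type `String` is visible in the function body, so the
// optimizer may (but is not required to) replace the vtable call with
// a direct call.
#[inline(never)]
fn known_type(s: &String) -> String {
    let name: &dyn ToString = s;
    name.to_string()
}

// Only `&dyn ToString` is visible here and inlining is disabled, so the
// function body alone does not reveal the concrete type. Expect, but
// verify, that the call goes through the vtable.
#[inline(never)]
fn unknown_type(name: &dyn ToString) -> String {
    name.to_string()
}

fn main() {
    let s = String::from("Steve");
    println!("{}", known_type(&s));
    // The same function is called with two different concrete types.
    println!("{}", unknown_type(&s));
    println!("{}", unknown_type(&42));
}
```

Both functions return the same strings; the difference, if any, only shows up in the generated IR or assembly.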

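The only way to be certain that the optimization is applied in your case, if it matters, is to inspect the emitted assembly.

But really, in this case, I would be more worried about the memory allocation than about the virtual call. A virtual call has about 5ns of overhead (though it does inhibit a number of optimizations), whereas a memory allocation, and the eventual deallocation, routinely costs 20ns to 30ns.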
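As a sketch of the allocation side of this trade-off, the following compares returning a boxed trait object with returning an opaque concrete type. It uses closures rather than futures so that it runs without an async executor; `make_adder_boxed` and `make_adder_static` are hypothetical names chosen for illustration. At the language level the boxed version asks for a heap allocation and dynamic dispatch, which the optimizer may or may not remove, while the `impl Fn` version needs neither.

```rust
// Boxed trait object: one heap allocation for the closure's captured
// state, plus dynamic dispatch on each call unless the optimizer
// manages to remove them.
fn make_adder_boxed(n: i32) -> Box<dyn Fn(i32) -> i32> {
    Box::new(move |x| x + n)
}

// Opaque concrete type: the closure is returned by value, with no Box,
// and calls to it are statically dispatched.
fn make_adder_static(n: i32) -> impl Fn(i32) -> i32 {
    move |x| x + n
}

fn main() {
    let boxed = make_adder_boxed(2);
    let unboxed = make_adder_static(2);
    println!("{} {}", boxed(40), unboxed(40));
}
```

Where the caller does not need type erasure, the second form sidesteps the question of whether the allocation gets optimized away.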

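If rustc/LLVM has enough context to eliminate the virtual call, it can also eliminate the allocation; at least that is what happens in [a simple example](https://www.reddit.com/r/rust/comments/eccrwr/why_rust_closures_are_somewhat_hard/fbat1ta/?utm_source=reddit&utm_medium=web2x&context=3). I am not saying the OP should rely on this, but there is at least hope that the allocation will be eliminated along with the virtual call and will not end up being what ruins performance. (3 upvotes)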
