Fast single dispatch to work around runtime multiple dispatch

Question

Bat*_*aBe 5 dynamic single-dispatch julia

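When type inference fails (::Any in the @code_warntype printout), my understanding is that function calls are dispatched dynamically. In other words, at run time the types of the arguments are checked in order to find the specialization (a MethodInstance) for the concrete argument types. Having to do this at run time rather than at compile time incurs a performance cost.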
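As a small illustration of the situation described above, the sketch below asks the compiler what it can infer for an element read from a Vector{Any} versus a Vector{Int}. The function name pick is hypothetical, and Base.return_types is an unexported reflection helper used here only as a scriptable stand-in for reading the @code_warntype printout.

```julia
# Hypothetical helper (not from the question): read the first element.
pick(v) = v[1]

# Inferred return types for an abstractly typed and a concretely typed container.
inferred_any = Base.return_types(pick, (Vector{Any},))
inferred_int = Base.return_types(pick, (Vector{Int},))

println("Vector{Any} element inferred as: ", inferred_any[1])
println("Vector{Int} element inferred as: ", inferred_int[1])
```

When the inferred type is Any, any call that takes that value as an argument generally cannot be resolved at compile time, which is the case the question is about.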

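(EDIT: Originally, I said "multiple dispatch finds the fitting method" between the type check and the specialization lookup, but I don't actually know whether that part happens at run time. It seems it only needs to happen when no valid specialization exists and one has to be compiled.)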

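In cases where the concrete type of only one argument needs to be checked, is it possible to do a faster dynamic single dispatch instead, as in some kind of lookup table of specializations? I just can't find a way to access and call MethodInstances as if they were functions.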

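When it comes to altering dispatch or specialization, I thought of invoke and @nospecialize. invoke looks like it might skip straight to a specified method, but checking multiple argument types and specializations still has to happen. @nospecialize doesn't skip any part of the dispatch process; it just results in different specializations.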
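For readers unfamiliar with the two tools mentioned above, here is a minimal sketch of what each one does. The function names describe and tag are hypothetical and chosen only for illustration; the sketch shows behavior, not performance.

```julia
# Two methods of a hypothetical function; the Int method is more specific.
describe(x::Integer) = "generic Integer method"
describe(x::Int) = "Int method"

# Ordinary dispatch picks the most specific applicable method.
normal = describe(1)

# invoke selects the method matching the given signature instead,
# here the less specific Integer method.
forced = invoke(describe, Tuple{Integer}, 1)

# @nospecialize hints that the compiler need not compile a separate
# specialization for every concrete type of x; the call is still dispatched.
function tag(@nospecialize(x))
    return x isa Integer ? "integer" : "other"
end

println(normal)
println(forced)
println(tag(1), " ", tag(1.0))
```

Neither tool provides a single-argument lookup by itself: invoke changes which method is chosen, and @nospecialize changes which specializations get compiled.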

EDIT: A minimal example with comments that hopefully describe what I'm talking about.

struct Foo end
struct Bar end

#   want to dispatch only on 1st argument
#          still want to specialize on 2nd argument
baz(::Foo, ::Integer) = 1
baz(::Foo, ::AbstractFloat) = 1.0
baz(::Bar, ::Integer) = 1im
baz(::Bar, ::AbstractFloat) = 1.0im

x = Any[Foo(), Bar(), Foo()]

# run test1(x, 1) or test1(x, 1.0)
function test1(x, second)
  #   first::Any in @code_warntype printout
  for first in x
    # first::Any requires dynamic dispatch of baz
    println(baz(first, second))
    # Is it possible to only dispatch -baz- on -first- given
    # the concrete types of the other arguments -second-?
  end
end


Answer 1

cbk*_*cbk 0

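The simplest way to do what you are asking is to not dispatch on the second argument (by not giving the second variable a type annotation specific enough to trigger dispatch), and instead to specialize with an if statement inside the function. For example: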

struct Foo end
struct Bar end

# Note lack of type assertion on second variable. 
# We could also write `baz(::Foo, n::Number)` for same effect in this case, 
# but type annotations have no performance benefit in Julia if you're not 
# dispatching on them anyways.
function baz(::Foo, n) 
    if isa(n, Integer)
        1
    elseif isa(n, AbstractFloat)
        1.0
    else
        error("unsupported type")
    end
end

function baz(::Bar, n)
    if isa(n, Integer)
        1im
    elseif isa(n, AbstractFloat)
        1.0im
    else
        error("unsupported type")
    end
end

Now, this will do what you want:

julia> x = Any[Foo(), Bar(), Foo()]
3-element Vector{Any}:
 Foo()
 Bar()
 Foo()

julia> test1(x, 1)
1
0 + 1im
1

julia> test1(x, 1.0)
1.0
0.0 + 1.0im
1.0

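Since this effectively hand-picks just two cases to specialize on out of all possible types, I can imagine scenarios where this technique has a performance benefit (though, as goes without saying in Julia, it is generally better to find and eliminate the source of the type instability in the first place, if at all possible).

However, in the context of this question, it is crucial to point out that even though we have eliminated dispatch on the second argument, the performance of these baz functions may still be poor if the first argument (i.e., the one you are dispatching on) is type-unstable, as is the case in the question as written, because of the use of an Array{Any}.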

Instead, try to use an array that has at least some type constraint. For example:
julia> function test2(x, second)
           s = 1+1im
           for first in x
               s += baz(first, second)
           end
           s
       end
test2 (generic function with 1 method)

julia> using BenchmarkTools

julia> x = Any[Foo(), Bar(), Foo()];

julia> @benchmark test2($x, 1)
BenchmarkTools.Trial: 10000 samples with 998 evaluations.
 Range (min … max):  13.845 ns … 71.554 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     13.869 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   15.397 ns ±  3.821 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▅ ▃ ▄ ▄ ▄ ▄ ▃ ▁
  ██▇▆█▇██▄█▇▇▄▃▁▁██▁▃▃▁▁▃██▃▁▃▁▁▄▃▃▃▆▆▅▆▆▅▅▄▁▁▄▃▃▃▁▃▁▄▁▁▃▄▄█ █
  13.8 ns      Histogram: log(frequency) by time      30.2 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> x = Union{Foo,Bar}[Foo(), Bar(), Foo()];

julia> @benchmark test2($x, 1)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  4.654 ns … 62.311 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     4.707 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   5.471 ns ±  1.714 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▂▂▃▄ ▃ ▄▁ ▄▂ ▅▁ ▁▄ ▁
  ███████▁▁██▁▁▁▁██▁▁▁▁▁▁██▁▁▁▄▁▃▁▃▁▁▁▁▃▁▁▁▁▃▁▃▃▁▁▁▁▃▁▁▁▁▁██ █
  4.65 ns      Histogram: log(frequency) by time      10.2 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.
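The faster timing with the Union{Foo,Bar} element type is commonly attributed to the compiler branching over the few possible concrete types rather than performing a full dynamic dispatch. As a rough sketch of the same idea done by hand for an Any-typed array, one can check the concrete type of the first argument with isa, so that each branch calls baz with a known first-argument type. The function name test3 is hypothetical, the definitions of Foo, Bar, and baz are repeated from the question so that the block runs on its own, and no benchmark was run for it, so whether it helps in a given case would need to be measured.

```julia
struct Foo end
struct Bar end

baz(::Foo, ::Integer) = 1
baz(::Foo, ::AbstractFloat) = 1.0
baz(::Bar, ::Integer) = 1im
baz(::Bar, ::AbstractFloat) = 1.0im

# Hypothetical variant of test2 that branches on the first argument by hand.
function test3(x, second)
    s = 1 + 1im
    for first in x
        if first isa Foo
            # Inside this branch the compiler knows first::Foo.
            s += baz(first, second)
        elseif first isa Bar
            # Inside this branch the compiler knows first::Bar.
            s += baz(first, second)
        else
            # Any other type falls back to ordinary dynamic dispatch.
            s += baz(first, second)
        end
    end
    return s
end

x = Any[Foo(), Bar(), Foo()]
println(test3(x, 1))
println(test3(x, 1.0))
```

The trade-off is that the list of types is hard-coded: adding a new first-argument type means either editing the branches or accepting the fallback path.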

Archived:	4 years ago
Views:	436
Last activity:	4 years ago