如何强制编译器生成手动切换的等效代码?

Ale*_*zov 18 c++ gcc clang c++11 c++14

任务很简单:我们有一个类型列表(即using list = std::tuple<A, B, C, ...>;),我们希望根据运行时已知的索引调度其内容.通过"dispatch"我的意思是调用一个模板化的处理程序,由该列表中的类型参数化.switch手动编写很容易:

template <class... Args>
auto dispatch(size_t i, Args&& ...args)
{
  switch (i) {
    case 0: return handler<typename std::tuple_element<0, list>::type>(std::forward<Args>(args)...);
    ...
  }
}
Run Code Online (Sandbox Code Playgroud)

但这很乏味,容易出错,而且很无聊.我正在寻找一种方法来强制编译器从更紧凑和简洁的定义中做到这一点.

我想要获得与手动开关相同或非常接近的结果.因此,生成函数指针表(如此)并不是一个理想的解决方案(我不想为处理程序完全无法使用的情况引入函数调用).

到目前为止,我发现了两种方法:

  1. 无法递归(感谢Horstling)

    template <size_t N>
    struct dispatch_helper
    {
        template <class... Args>
        static R dispatch(size_t i, Args&& ...args)
        {
            if (N == i)
                return handler<typename std::tuple_element<N, list>::type>(std::forward<Args>(args)...);
            return dispatch_helper<N+1>::dispatch(i, std::forward<Args>(args)...);
        }
    };
    
    template <>
    struct dispatch_helper<200>
    {
        template <class... Args>
        static R dispatch(size_t type, Args&& ...args)
        {
            return {};
        }
    };
    
    template <class... Args>
    auto dispatch_recursion(size_t i, Args&& ...args)
    {
        return dispatch_helper<0>::dispatch(i, std::forward<Args>(args)...);
    }
    
    Run Code Online (Sandbox Code Playgroud)
  2. 用法 std::initializer_list

    template <class L>
    struct Iterator;
    
    template <template <class...> class L, class... T>
    struct Iterator<L<T...>>
    {
        template <class... Args>
        static R dispatch(size_t i, Args&& ...args)
        {
            R res;
            std::initializer_list<int> l {
                (
                    i == T::value ? (res = handler<typename T::type>(std::forward<Args>(args)...), 0) : 0
                )...
            };
            (void)l;
            return res;
        }
    };
    
    template <class... Args>
    auto dispatch_type_list_iter(size_t i, Args&& ...args)
    {
        return Iterator<index_list<list>>::dispatch(i, std::forward<Args>(args)...);
    }
    
    Run Code Online (Sandbox Code Playgroud)

    我在这里省略了一些细节,比如将类型列表转换为索引类型列表或如何处理返回类型,但我希望这个想法很明确.

我尝试使用godbolt和Clang 4.0为第一种方法生成一个"对数"开关(一个小开关树)和一个代码,与第二种方法的手动开关相同.我使用了处理程序的大小,无法使用或无法使用的处理程序,结果看起来很稳定.

但GCC和ICC都产生了一个简单的条件序列(对于这两种方法),这非常令人难过.

那么,还有其他解决方案吗?特别是那些至少在Clang GCC工作的人.

Out*_*und 3

您可以自己进行二分搜索:

#include <tuple>
#include <iostream>

template <class T>
void function(T arg) {
    std::cout << arg << std::endl;
}
Run Code Online (Sandbox Code Playgroud)

如果您能够使用 c++14 并且可以使用 auto 进行返回类型推导,使用 auto 进行泛型 lambda 表达式,您可以更进一步:

#include <tuple>
#include <iostream>
#include <string>

template <class Function, size_t begin, size_t end, class Tuple>
struct  call_on_element;

template <class Function, size_t begin, class Tuple>
struct call_on_element<Function, begin, begin, Tuple> {
    static auto call(Function f, size_t, const Tuple& t) {
        return f(std::get<begin>(t));
    }
};

template <class Function, size_t begin, size_t end, class Tuple>
struct call_on_element {
    static auto call(Function f, size_t i, const Tuple& t) {
        constexpr size_t half = begin + (end - begin) / 2;
        if(i <= half) {
            return call_on_element<Function, begin, half, Tuple>::call(f, i, t);
        } else {
            return call_on_element<Function, half + 1, end, Tuple>::call(f, i, t);
        }
    }
};

template <class Function, class... Args>
auto dispatch(Function f, size_t i, Args&&... args) {
    const auto tuple = std::make_tuple(std::forward<decltype(args)>(args)...);
    return call_on_element<Function, 0, sizeof...(Args) - 1, decltype(tuple)>::call(f, i , tuple);
}


struct Functor {
    template <class T>
    std::string operator()(T arg) {
        std::cout << arg << std::endl;
        return "executed functor";
    }
};

int main() {

    int i = 42;
    char c = 'y';
    float f = 1337;

    std::cout << "using Functor" << std::endl;
    dispatch(Functor(), 0, i, c, f);
    dispatch(Functor(), 1, i, c, f);
    auto s = dispatch(Functor(), 2, i, c, f);

    std::cout << "return value of dispatch = " << s << std::endl;

    std::cout << "using lambda" << std::endl;
    dispatch([](auto arg) { std::cout << arg << std::endl;}, 0, i, c, f);
    dispatch([](auto arg) { std::cout << arg << std::endl;}, 1, i, c, f);
    dispatch([](auto arg) { std::cout << arg << std::endl;}, 2, i, c, f);

};
Run Code Online (Sandbox Code Playgroud)

我想这对于编译器来说更难内联,但使用起来更舒适,而且也不是不可能。


为了检查内联,我将 c++14 版本修改为:

int main() {
    dispatch(1, 42, 'c', 3.14);
};

// compiled with 'g++ -g -O3 -std=c++14 -S main.cpp' (using gcc7.1.1)
// generated objectdump with objdump -d a.out | c++filt
Run Code Online (Sandbox Code Playgroud)

Objdump 提供以下输出:

00000000000009a0 <main>:
 9a0:   55                      push   %rbp
 9a1:   53                      push   %rbx
 9a2:   48 8d 3d d7 16 20 00    lea    0x2016d7(%rip),%rdi        # 202080 <std::cout@@GLIBCXX_3.4>
 9a9:   ba 01 00 00 00          mov    $0x1,%edx
 9ae:   48 83 ec 18             sub    $0x18,%rsp
 9b2:   48 8d 74 24 07          lea    0x7(%rsp),%rsi
 9b7:   c6 44 24 07 63          movb   $0x63,0x7(%rsp)
 9bc:   64 48 8b 04 25 28 00    mov    %fs:0x28,%rax
 9c3:   00 00 
 9c5:   48 89 44 24 08          mov    %rax,0x8(%rsp)
 9ca:   31 c0                   xor    %eax,%eax
 9cc:   e8 7f ff ff ff          callq  950 <std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)@plt>
 9d1:   48 89 c5                mov    %rax,%rbp
 9d4:   48 8b 00                mov    (%rax),%rax
 9d7:   48 8b 40 e8             mov    -0x18(%rax),%rax
 9db:   48 8b 9c 05 f0 00 00    mov    0xf0(%rbp,%rax,1),%rbx
 9e2:   00 
 9e3:   48 85 db                test   %rbx,%rbx
 9e6:   74 67                   je     a4f <main+0xaf>
 9e8:   80 7b 38 00             cmpb   $0x0,0x38(%rbx)
 9ec:   74 2d                   je     a1b <main+0x7b>
 9ee:   0f be 73 43             movsbl 0x43(%rbx),%esi
 9f2:   48 89 ef                mov    %rbp,%rdi
 9f5:   e8 86 ff ff ff          callq  980 <std::basic_ostream<char, std::char_traits<char> >::put(char)@plt>
 9fa:   48 89 c7                mov    %rax,%rdi
 9fd:   e8 5e ff ff ff          callq  960 <std::basic_ostream<char, std::char_traits<char> >::flush()@plt>
 a02:   31 c0                   xor    %eax,%eax
 a04:   48 8b 4c 24 08          mov    0x8(%rsp),%rcx
 a09:   64 48 33 0c 25 28 00    xor    %fs:0x28,%rcx
 a10:   00 00 
 a12:   75 36                   jne    a4a <main+0xaa>
 a14:   48 83 c4 18             add    $0x18,%rsp
 a18:   5b                      pop    %rbx
 a19:   5d                      pop    %rbp
 a1a:   c3                      retq   
 a1b:   48 89 df                mov    %rbx,%rdi
 a1e:   e8 fd fe ff ff          callq  920 <std::ctype<char>::_M_widen_init() const@plt>
 a23:   48 8b 03                mov    (%rbx),%rax
 a26:   48 8d 0d 73 01 00 00    lea    0x173(%rip),%rcx        # ba0 <std::ctype<char>::do_widen(char) const>
 a2d:   be 0a 00 00 00          mov    $0xa,%esi
 a32:   48 8b 50 30             mov    0x30(%rax),%rdx
 a36:   48 39 ca                cmp    %rcx,%rdx
 a39:   74 b7                   je     9f2 <main+0x52>
 a3b:   be 0a 00 00 00          mov    $0xa,%esi
 a40:   48 89 df                mov    %rbx,%rdi
 a43:   ff d2                   callq  *%rdx
 a45:   0f be f0                movsbl %al,%esi
 a48:   eb a8                   jmp    9f2 <main+0x52>
 a4a:   e8 21 ff ff ff          callq  970 <__stack_chk_fail@plt>
 a4f:   e8 bc fe ff ff          callq  910 <std::__throw_bad_cast()@plt>
 a54:   66 90                   xchg   %ax,%ax
 a56:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
 a5d:   00 00 00 

template <size_t begin, size_t end, class Tuple>
struct  call_on_element;

template <size_t begin, class Tuple>
struct call_on_element<begin, begin, Tuple> {
    static void call(size_t, const Tuple& t) {
        function(std::get<begin>(t));
    }
};

template <size_t begin, size_t end, class Tuple>
struct call_on_element {
    static void call(size_t i, const Tuple& t) {
        constexpr size_t half = begin + (end - begin) / 2;
        if(i <= half) {
            call_on_element<begin, half, Tuple>::call(i, t);
        } else {
            call_on_element<half+1, end, Tuple>::call(i, t);
        }
    }
};

template <class... Args>
void dispatch(size_t i, Args&&... args) {
    const auto tuple = std::make_tuple(std::forward<decltype(args)>(args)...);
    call_on_element<0, sizeof...(Args)-1, decltype(tuple)>::call(i , tuple);
}

int main() {

    int i = 42;
    char c = 'y';
    float f = 1337;

    dispatch(0, i, c, f);
    dispatch(1, i, c, f);
    dispatch(2, i, c, f);

};
Run Code Online (Sandbox Code Playgroud)

callq std::basic_ostream<...>对和 的调用je ...让我相信,它实际上内联了函数并执行对数条件树。