Why does the C compiler optimize switch and if differently

Question

Lam*_*eta 9 c optimization gcc bit-manipulation disassembly

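I was working on a personal project when I stumbled upon an odd issue.

In a very tight loop I have an integer whose value is between 0 and 15. I need to get -1 for the values 0, 1, 8, and 9, and 1 for the values 4, 5, 12, and 13.

I turned to godbolt to check some options and was surprised that the compiler apparently could not optimize a switch statement the same way as an if chain.

The code is: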

const int lookup[16] = {-1, -1, 0, 0, 1, 1, 0, 0, -1, -1, 0, 0, 1, 1, 0, 0};

int a(int num) {
    return lookup[num & 0xF];
}

int b(int num) {
    num &= 0xF;

    if (num == 0 || num == 1 || num == 8 || num == 9) 
        return -1;

    if (num == 4 || num == 5 || num == 12 || num == 13)
        return 1;

    return 0;
}

int c(int num) {
    num &= 0xF;
    switch (num) {
        case 0: case 1: case 8: case 9: 
            return -1;
        case 4: case 5: case 12: case 13:
            return 1;
        default:
            return 0;
    }
}

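I would have thought that b and c would yield the same results, and I was hoping I could read the bit hacks to come up with an efficient implementation myself, since my solution (the switch statement, in another form) was fairly slow.

Oddly, b compiled to bit hacks, while c was either pretty much unoptimized or reduced to a variant of a, depending on the target hardware.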
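For reference, here is a hand-written sketch of what such a bit hack could look like. It is not the compiler's output. It relies only on an observation about the table in the question: the result is 0 whenever bit 1 of the value is set, and otherwise bit 2 selects between -1 and 1. The name b_bits is hypothetical.

```c
/* Hypothetical hand-written equivalent of b(); not taken from any
   compiler output. Only bits 1 and 2 of the input affect the result. */
int b_bits(int num) {
    unsigned u = (unsigned)num & 0xFu;    /* keep the low four bits */
    int zero = (int)((u >> 1) & 1u);      /* 1 when bit 1 is set: result is 0 */
    int sign = (int)((u >> 1) & 2u) - 1;  /* bit 2 set: +1, bit 2 clear: -1 */
    return (1 - zero) * sign;             /* zeroes the sign when bit 1 is set */
}
```

Whether this beats the lookup table or the if chain depends on the compiler and target, so it would need to be measured.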

Can anybody explain why there is this discrepancy? What is the 'correct' way to optimize this query?

EDIT:

Clarification

I want the switch solution, or a similarly "clean" one, to be the fastest. However, when compiled with optimizations on my machine, the if solution is significantly faster.

I wrote a quick program to demonstrate and TIO has the same results as I find locally: Try it online!

With static inline the lookup table speeds up a bit: Try it online!
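The linked programs are not reproduced here. A minimal sketch of a comparable timing harness, assuming only the three functions from the question, could look like the following. The helpers checksum and time_it are hypothetical names, and calling through a function pointer is an assumption of this sketch that may prevent inlining and so change the relative timings.

```c
#include <time.h>

const int lookup[16] = {-1, -1, 0, 0, 1, 1, 0, 0, -1, -1, 0, 0, 1, 1, 0, 0};

/* Table lookup, as in the question. */
int a(int num) {
    return lookup[num & 0xF];
}

/* If chain, as in the question. */
int b(int num) {
    num &= 0xF;
    if (num == 0 || num == 1 || num == 8 || num == 9)
        return -1;
    if (num == 4 || num == 5 || num == 12 || num == 13)
        return 1;
    return 0;
}

/* Switch, as in the question. */
int c(int num) {
    num &= 0xF;
    switch (num) {
        case 0: case 1: case 8: case 9:
            return -1;
        case 4: case 5: case 12: case 13:
            return 1;
        default:
            return 0;
    }
}

/* Hypothetical helper: sums fn(i) for i in [0, n) so the calls have an
   observable result and cannot be discarded as dead code. */
long checksum(int (*fn)(int), int n) {
    long sum = 0;
    for (int i = 0; i < n; i++)
        sum += fn(i);
    return sum;
}

/* Hypothetical helper: returns the CPU time in seconds spent in
   checksum(fn, n) and stores the sum in *out. */
double time_it(int (*fn)(int), int n, long *out) {
    clock_t start = clock();
    *out = checksum(fn, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_it(a, n, &sum), time_it(b, n, &sum), and time_it(c, n, &sum) with the same large n, and checking that the three sums match, gives a rough comparison. Results will vary with compiler version, flags, and hardware.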

Answer 1

Ala*_*got 6

If you explicitly enumerate all the cases, gcc is very efficient:

int c(int num) {
    num &= 0xF;
    switch (num) {
        case 0: case 1: case 8: case 9: 
            return -1;
        case 4: case 5: case 12: case 13:
            return 1;
        case 2: case 3: case 6: case 7: case 10: case 11: case 14: case 15:
        //default:
            return 0;
    }
}

This compiles to just a simple indexed branch:

c:
        and     edi, 15
        jmp     [QWORD PTR .L10[0+rdi*8]]
.L10:
        .quad   .L12
        .quad   .L12
        .quad   .L9
        .quad   .L9
        .quad   .L11
        .quad   .L11
        .quad   .L9
        .quad   .L9
        .quad   .L12
etc...

Note that if default: is uncommented, gcc reverts to its nested-branch version.

Archived:	6 years, 1 month ago
Views:	165
Last activity:	6 years, 1 month ago