C中有符号整数中的“异常”

Jak*_*LEE 5 c optimization integer arm bit-manipulation

我目前正在写有关ARM优化的演讲,特别是在矢量机(例如NEON)上作为最终目标。

而且由于向量机与if-else激流回旋系统的配合不好,因此我试图演示如何通过位黑客摆脱它们。

我以“绝对饱和”功能为例。这实际上是一个ABS例程,具有将结果限制为0x7fffffff的附加功能。

可能的最大负32位数字为0x80000000,这是非常危险的事情,因为val = -val;返回的初始值与0x80000000相同,这是由二进制补码系统(尤其是DSP操作)中的不对称性引起的,因此必须将其大部分过滤掉通过“饱和”。

int32_t satAbs1(int32_t val)
{
    if (val < 0) val = -val;
    if (val < 0) val = 0x7fffffff;
    return val;
}
Run Code Online (Sandbox Code Playgroud)

以下是我将在汇编中编写的内容:

cmp     r0, #0
rsblts  r0, r0, #0
mvnlt   r0, #0x80000000
bx      lr
Run Code Online (Sandbox Code Playgroud)

下面是我实际上从上面的C代码中获得的东西:

satAbs1
        0x00000000:    CMP      r0,#0
        0x00000004:    RSBLT    r0,r0,#0
        0x00000008:    BX       lr
Run Code Online (Sandbox Code Playgroud)

什么 编译器只是完全丢弃了饱和部分!

编译器似乎排除了val在第一条if语句之后为负的情况,如果它为0x80000000,则为true

还是函数应该返回无符号值?

uint32_t satAbs2(int32_t val)
{
    uint32_t result;
    if (val < 0) result = (uint32_t) -val; else result = (uint32_t) val;
    if (result == 0x80000000) result = 0x7fffffff;
    return result;
}

satAbs2
        0x0000000C:    CMP      r0,#0
        0x00000010:    RSBLT    r0,r0,#0
        0x00000014:    BX       lr
Run Code Online (Sandbox Code Playgroud)

不幸的是,它生成与签名版本完全相同的机器代码:没有饱和。

同样,编译器似乎排除了val0x80000000 的情况

好的,让我们扩大第二个if语句的范围:

uint32_t satAbs3(int32_t val)
{
    uint32_t result;
    if (val < 0) result = (uint32_t) -val; else result = (uint32_t) val;
    if (result >= 0x80000000) result = 0x7fffffff;
    return result;
}

satAbs3
        0x00000018:    CMP      r0,#0
        0x0000001C:    RSBLT    r0,r0,#0
        0x00000020:    CMP      r0,#0
        0x00000024:    MVNLT    r0,#0x80000000
        0x00000028:    BX       lr
Run Code Online (Sandbox Code Playgroud)

最后,尽管看起来是最优的,编译器似乎正在完成它的工作(CMP与汇编版本相比是不必要的)

我可以接受次优的编译器,但令我困扰的是它们排除了不应该使用的内容:0x80000000

我什至就此向GCC开发人员提交了错误报告,但我发现这Clang还排除了整数为0x80000000的情况,因此我想我缺少与C标准有关的内容。

谁能告诉我我在哪里弄错了?

顺便说一句,下面是if-less比特黑客版本的样子:

int32_t satAbs_bh(int32_t val)
{
    int32_t temp = val ^ (val>>31);
    val = temp + (val>>31);
    val ^= val>>31;
    return val;
}

satAbs_bh
        0x0000002C:    EOR      r3,r0,r0,ASR #31
        0x00000030:    ADD      r0,r3,r0,ASR #31
        0x00000034:    EOR      r0,r0,r0,ASR #31
        0x00000038:    BX       lr
Run Code Online (Sandbox Code Playgroud)

编辑:我同意我的这个问题在某种程度上是重复的。
但是,它更加全面,包括一些组装级的东西和位掩码技术,这些技术与所提到的相比可能会有所帮助。

下面提供了一个解决此问题的方法,而无需破坏编译器选项;预先排除整数溢出的可能性:

int32_t satAbs4(int32_t val)
{
    if (val == 0x80000000) return 0x7fffffff;
    if (val < 0) val = -val;
    return val;
}
satAbs4
        0x0000002C:    CMP      r0,#0x80000000
        0x00000030:    BEQ      {pc}+0x10 ; 0x40
        0x00000034:    CMP      r0,#0
        0x00000038:    RSBLT    r0,r0,#0
        0x0000003C:    BX       lr
        0x00000040:    MVN      r0,#0x80000000
        0x00000044:    BX       lr
Run Code Online (Sandbox Code Playgroud)

同样,linaro GCC 7.4.1我正在使用的程序演示了它的缺点:我不理解第BEQ2行,moveq r0, #0x80000001如源代码中所建议的那样,可能在末尾保存了两条指令。

Gro*_*roo 8

Signed integer overflow or underflow is undefined behavior in C, meaning that you are expected to handle these edge cases yourself. In other words, as soon as the compiler has established that a certain signed integer value is positive, it doesn't care if there is a possibility that it could turn negative through UB.

For example, this code:

int test(int input)
{
    if (input > 0)
        input += 100;

    if (input > 0)
        input += 100;

    if (input > 0)
        input += 100;

    return input;
}
Run Code Online (Sandbox Code Playgroud)

can legally be optimized to this:

int test(int input)
{
    if (input > 0)
        input += 300;

    return input;
}
Run Code Online (Sandbox Code Playgroud)

even though the author of the initial code might have expected that input could overflow between each successive statements.

That's why an optimizing compiler sees your code as something like this:

int32_t satAbs1(int32_t val)
{
    if (val < 0) val = -val;

    // val must be positive here,
    // unless you are relying on UB

    // the following condition is 
    // therefore always false:
    // if (val < 0) val = 0x7fffffff;

    return val;
}
Run Code Online (Sandbox Code Playgroud)

So, the only way to avoid UB is to avoid negating the signed integer if there is a chance that it might invoke UB, i.e.:

int32_t satAbs3_simple(int32_t val)
{
    if (val >= 0) 
        return val;

    // we know that val is negative here, 
    // but unfortunately gcc knows it as well,
    // so we'll handle the edge case explicitly

    if (val == INT32_MIN)
        return INT32_MAX;

    return -val;
}
Run Code Online (Sandbox Code Playgroud)

gcc with -O2 produces code with a branch (early conditional return at bxge):

satAbs3_basic:
  cmp r0, #0
  bxge lr // return r0 if ge #0
  cmp r0, #0x80000000
  rsbne r0, r0, #0
  moveq r0, #0x7FFFFFFF
  bx lr
Run Code Online (Sandbox Code Playgroud)

As @rici mentioned in the comments, if exact-width signed int types from stdint.h (intN_t) are available on your compiler, this means they have to be represented with N bits, no padding, using 2's complement.

This means that you can rewrite the code slightly to use bit masks, which might provide a slightly shorter assembly output (at least with gcc 5 or newer), still without branching:

int32_t satAbs3_c(int32_t val)
{
    uint32_t result = (uint32_t)val;
    if (result & 0x80000000) result = -result; // <-- avoid UB here by negating uint32_t
    if (result == 0x80000000) result = 0x7FFFFFFF;
    return (int32_t)result;
}
Run Code Online (Sandbox Code Playgroud)

Note that an optimizing compiler should theoretically be able to produce this same output for both cases, but anyway, recent gcc versions (with -O1) for the last snippet give:

satAbs3_c:
  cmp r0, #0
  rsblt r0, r0, #0
  cmp r0, #0x80000000
  moveq r0, #0x7FFFFFFF
  bx lr
Run Code Online (Sandbox Code Playgroud)

我实际上相信它不能比这短(除了xor比特破解),因为您的初始程序集之后似乎缺少cmp r0, #0指令rsblts(因为rsbltschange r0,并且cmp是进行实际比较的部分)。