Which floating-point values cannot be converted to int without undefined behavior? [C++]

ric*_*cab 7 c++ type-conversion implicit-conversion c++14

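I just read this in the C++14 standard (emphasis mine):

4.9 Floating-integral conversions [conv.fpint]

1 A prvalue of a floating point type can be converted to a prvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type. [...]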
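To illustrate the quoted rule, here is a minimal sketch. The function name `truncate_to_int` is hypothetical and only wraps the cast; it is well defined only for inputs whose truncated value fits in `int`.

```cpp
// Hypothetical wrapper around the conversion described in [conv.fpint].
// Only call it with values whose truncated result fits in int.
int truncate_to_int(float f) {
    return static_cast<int>(f);  // discards the fractional part (rounds toward zero)
}
```

For example, `truncate_to_int(3.9f)` yields 3 and `truncate_to_int(-3.9f)` yields -3.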

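This got me thinking:

  1. Which float values (if any) cannot be represented as an int after truncation? (Is this implementation-dependent?)
  2. If there are any, does this mean that auto x = static_cast<int>(float) is unsafe?
  3. What is the proper/safe way to convert a float to an int, then (assuming you want truncation)?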

Mik*_*ine 7

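We looked into this a while back, and I hand-built some tables of the floating-point values that sit at the edges of the various conversions to integers of various sizes. Note that this assumes IEEE 754 4-byte floats and 8-byte doubles, and two's complement signed integers (a 4-byte int32_t and an 8-byte int64_t).

If you need to turn the bit patterns into floats or doubles, you need to either type-pun them (technically UB) or memcpy them.

To answer your question: anything too large to fit in the target integer is UB on conversion, and the only case where truncation toward zero makes a difference is double -> int32_t. So, using the values below, you can compare your floating-point value against the relevant min/max and cast only if it is in range.

Note that cross-converting INT_MIN/INT_MAX (or their modern numeric_limits counterparts) and then comparing does not always work, because floating-point values of that magnitude have very low precision.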
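A minimal sketch of why the cross-conversion fails, assuming a 32-bit int and IEEE 754 single precision with the default round-to-nearest mode. Both function names are hypothetical. INT_MAX (2147483647) is not representable as a float, so converting it rounds up to 2147483648.0f, and the naive check then accepts a value whose conversion is UB.

```cpp
#include <climits>

// Naive check: static_cast<float>(INT_MAX) rounds to 2147483648.0f under
// the stated assumptions, so 2^31 is wrongly reported as "in range".
bool naive_fits_int(float f) {
    return f >= static_cast<float>(INT_MIN) && f <= static_cast<float>(INT_MAX);
}

// Stricter check: compare against the exactly representable power of two
// and use a strict inequality on the upper side.
bool strict_fits_int(float f) {
    return f >= -2147483648.0f && f < 2147483648.0f;
}
```

With these definitions, `naive_fits_int(2147483648.0f)` is true even though the cast would be undefined, while `strict_fits_int(2147483648.0f)` is false.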

Converting Inf/NaN is also UB.

// float->int64 edgecases
static const uint32_t FloatbitsMaxFitInt64 = 0x5effffff; // [9223371487098961920] Largest float which still fits in a signed int64
static const uint32_t FloatbitsMinNofitInt64 = 0x5f000000; // [9223372036854775808] the bit pattern of the smallest float which is too big for a signed int64
static const uint32_t FloatbitsMinFitInt64 = 0xdf000000; // [-9223372036854775808] Smallest float which still fits in a signed int64
static const uint32_t FloatbitsMaxNotfitInt64 = 0xdf000001; // [-9223373136366403584] Largest float which is too small for a signed int64

// float->int32 edgecases
static const uint32_t FloatbitsMaxFitInt32 = 0x4effffff; // [2147483520] the bit pattern of the largest float which still fits in a signed int32
static const uint32_t FloatbitsMinNofitInt32 = 0x4f000000; // [2147483648] the bit pattern of the smallest float which is too big for a signed int32
static const uint32_t FloatbitsMinFitInt32 = 0xcf000000; // [-2147483648] the bit pattern of the smallest float which still fits in a signed int32
static const uint32_t FloatbitsMaxNotfitInt32 = 0xcf000001; // [-2147483904] the bit pattern of the largest float which is too small for a signed int32

// double->int64 edgecases
static const uint64_t DoubleBitsMaxFitInt64 = 0x43dfffffffffffff; // [9223372036854774784] Largest double which fits into an int64
static const uint64_t DoubleBitsMinNofitInt64 = 0x43e0000000000000; // [9223372036854775808] Smallest double which is too big for an int64
static const uint64_t DoubleBitsMinFitInt64 = 0xc3e0000000000000; // [-9223372036854775808] Smallest double which fits into an int64
static const uint64_t DoubleBitsMaxNotfitInt64 = 0xc3e0000000000001; // [-9223372036854777856] largest double which is too small to fit into an int64

// double->int32 edgecases[when truncating(round towards zero)]
static const uint64_t DoubleBitsMaxTruncFitInt32 = 0x41dfffffffffffff; // [~2147483647.9999998] Largest double that when truncated will fit into an int32
static const uint64_t DoubleBitsMinTruncNofitInt32 = 0x41e0000000000000; // [2147483648.0000000] Smallest double that when truncated won't fit into an int32
static const uint64_t DoubleBitsMinTruncFitInt32 = 0xc1e00000001fffff; // [~-2147483648.9999995] Smallest double that when truncated will fit into an int32
static const uint64_t DoubleBitsMaxTruncNofitInt32 = 0xc1e0000000200000; // [-2147483649.0000000] Largest double that when truncated won't fit into an int32

// double->int32 edgecases [when rounding via bankers method(round to nearest, round to even on half)]
static const uint64_t DoubleBitsMaxRoundFitInt32 = 0x41dfffffffdfffff; // [~2147483647.4999998] Largest double that when rounded will fit into an int32
static const uint64_t DoubleBitsMinRoundNofitInt32 = 0x41dfffffffe00000; // [2147483647.5000000] Smallest double that when rounded won't fit into an int32
static const uint64_t DoubleBitsMinRoundFitInt32 = 0xc1e0000000100000; // [-2147483648.5000000] Smallest double that when rounded will fit into an int32
static const uint64_t DoubleBitsMaxRoundNofitInt32 = 0xc1e0000000100001; // [~-2147483648.5000005] Largest double that when rounded won't fit into an int32

So the example you want is:

if( f >= B2F(FloatbitsMinFitInt32) && f <= B2F(FloatbitsMaxFitInt32))
    // cast is valid.

where B2F is as follows:

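#include <stdint.h> // uint32_t
#include <string.h> // memcpy

// Reinterpret a 32-bit pattern as a float without type-punning UB.
float B2F(uint32_t bits)
{
    static_assert(sizeof(float) == sizeof(uint32_t), "Weird arch");
    float f;
    memcpy(&f, &bits, sizeof(float));
    return f;
}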

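Note that this range check correctly rejects NaN/Inf (comparisons against NaN are false, and Inf falls outside the range), unless you use a non-IEEE 754 mode of your compiler (e.g. -ffast-math on gcc or /fp:fast on msvc).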
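The same pattern carries over to the double tables. The following is a sketch under the same assumptions (IEEE 754 8-byte doubles, two's complement int32_t). `B2D` and `FitsInt32WhenTruncated` are hypothetical names that do not appear in the tables above; the bit patterns are the DoubleBitsMinTruncFitInt32 and DoubleBitsMaxTruncFitInt32 values.

```cpp
#include <stdint.h> // uint64_t
#include <string.h> // memcpy

// Hypothetical double counterpart of B2F.
double B2D(uint64_t bits)
{
    static_assert(sizeof(double) == sizeof(uint64_t), "Weird arch");
    double d;
    memcpy(&d, &bits, sizeof(double));
    return d;
}

// True if truncating d toward zero yields a value representable in int32_t.
bool FitsInt32WhenTruncated(double d)
{
    const double lo = B2D(0xc1e00000001fffffULL); // just above -2147483649.0
    const double hi = B2D(0x41dfffffffffffffULL); // just below  2147483648.0
    return d >= lo && d <= hi;                    // false for NaN
}
```

When the function returns true, `static_cast<int32_t>(d)` is well defined.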


ana*_*lyg 4

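It is no surprise that float values can exceed the range of int. Floating-point types were invented precisely to represent very large (and very small) values.

  1. INT_MAX + 1 (usually equal to 2147483648) cannot be represented by int, but can be represented by float.
  2. Yes, static_cast<int>(float) is as unsafe as undefined behavior gets. However, something as simple as x + y is also UB for sufficiently large integers x and y, so there is no big surprise here either.
  3. The right way to handle it depends on the application, as always in C++. Boost has numeric_cast, which throws an exception on overflow; that may be good enough for you. To saturate instead (convert out-of-range values to INT_MIN or INT_MAX), write code like this: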

    float f;
    int i;
    ...
    if (static_cast<double>(INT_MIN) <= f && f < static_cast<double>(INT_MAX))
        i = static_cast<int>(f);
    else if (f < 0)
        i = INT_MIN;
    else
        i = INT_MAX;
    

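    However, this is not ideal. Does your system have a double type that can represent the maximum value of int? If so, it will work. Also, how exactly do you want to round values that are close to the minimum or maximum value of int? If you don't want to think about such issues, use boost::numeric_cast, as described here.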
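One way to avoid depending on whether double can represent INT_MAX is to compare against INT_MAX + 1, which is a power of two and therefore exactly representable in a binary float. Below is a minimal sketch of that variant, assuming a radix-2 float and a two's complement int. The name `saturate_to_int` and the choice to map NaN to 0 are assumptions made for this illustration, not part of the answer above.

```cpp
#include <cmath>
#include <limits>

// Hypothetical saturating float -> int conversion (truncates toward zero).
int saturate_to_int(float f)
{
    // 2^digits == INT_MAX + 1, exactly representable in a binary float.
    const float upper = std::ldexp(1.0f, std::numeric_limits<int>::digits);
    const float lower = -upper; // equals INT_MIN on two's complement systems

    if (std::isnan(f))
        return 0; // assumption: NaN maps to 0
    if (f >= upper)
        return std::numeric_limits<int>::max();
    if (f <= lower)
        return std::numeric_limits<int>::min();
    return static_cast<int>(f); // in range, so the cast is well defined
}
```

For instance, `saturate_to_int(1e20f)` returns INT_MAX, `saturate_to_int(-1e20f)` returns INT_MIN, and `saturate_to_int(-3.9f)` returns -3.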