What is the most efficient way to convert a vector of uint32 to a vector of float?

Question

zr.*_*zr. 5 floating-point x86 assembly sse

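x86 has no SSE instruction for converting an unsigned int32 to floating point. What is the most efficient instruction sequence for achieving this?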

EDIT: To clarify, I want a vector version of the following scalar operation:

unsigned int x = ...
float res = (float)x;

EDIT2: Here is a naive algorithm for doing the scalar conversion.

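unsigned int x = ...
float bias = 0.f;
if (x > 0x7fffffff) {              // top bit set: does not fit in a signed int32
    bias = (float)0x80000000;      // 2^31, exactly representable as a float
    x -= 0x80000000;               // now x < 2^31
}
float res = (float)(int)x + bias;  // signed conversion, then add the bias back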

Answer 1

Ste*_*non 4

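Your naive scalar algorithm does not deliver a correctly rounded conversion: it suffers from double rounding on certain inputs. For example, if x is 0x88000081, the correctly rounded result of converting to float is 2281701632.0f, but your scalar algorithm returns 2281701376.0f.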

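Off the top of my head, you can do a correct conversion as follows (since this is off the top of my head, it is probably possible to save an instruction somewhere):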

movdqa   xmm1,  xmm0    // make a copy of x
psrld    xmm0,  16      // high 16 bits of x
pand     xmm1, [mask]   // low 16 bits of x
orps     xmm0, [onep39] // float(2^39 + (high 16 bits of x) * 2^16)
cvtdq2ps xmm1, xmm1     // float(low 16 bits of x)
subps    xmm0, [onep39] // float((high 16 bits of x) * 2^16)
addps    xmm0,  xmm1    // float(x)

where the constants have the following values:

mask:   0000ffff 0000ffff 0000ffff 0000ffff
onep39: 53000000 53000000 53000000 53000000
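
To see why the onep39 constant works, here is a rough scalar model of one lane, assuming float is IEEE-754 binary32. 0x53000000 is the bit pattern of 2^39, whose unit in the last place is 2^16, so OR-ing the high 16 bits of x into the low significand bits yields 2^39 + hi * 2^16, and subtracting 2^39 leaves hi * 2^16 exactly. The helper names bits_to_float and high_half_to_float are made up for this sketch.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float (assumes IEEE-754 binary32). */
static float bits_to_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Scalar model of psrld / orps / subps for a single lane.
   Hypothetical helper, not part of the original answer. */
static float high_half_to_float(uint32_t x) {
    uint32_t hi = x >> 16;                           /* psrld xmm0, 16 */
    float biased = bits_to_float(0x53000000u | hi);  /* 2^39 + hi * 2^16 */
    return biased - bits_to_float(0x53000000u);      /* hi * 2^16, exact */
}
```

For x = 0x88000081 this should give 2281701376.0f, which is 0x88000000, the upper half of x with the low 16 bits cleared.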

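What this does is convert the high and low halves of each lane to floating point separately, then add the converted values together. Because each half is only 16 bits wide, the conversion to float incurs no rounding. Rounding occurs only when the two halves are added; because addition is a correctly rounded operation, the whole conversion is correctly rounded.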
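
If you would rather use intrinsics than raw assembly, the same sequence might look roughly like the sketch below. It assumes SSE2 (emmintrin.h) and the default round-to-nearest mode; the function names u32x4_to_f32x4 and convert_u32_to_f32 are made up, and the OR is done in the integer domain (por) rather than with orps, which produces the same bits.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Convert four uint32 lanes to float, mirroring the assembly sequence.
   Hypothetical function name. */
static __m128 u32x4_to_f32x4(__m128i x) {
    const __m128i mask   = _mm_set1_epi32(0x0000ffff);
    const __m128i onep39 = _mm_set1_epi32(0x53000000);  /* bit pattern of 2^39 */

    __m128i hi = _mm_srli_epi32(x, 16);                 /* high 16 bits of x */
    __m128i lo = _mm_and_si128(x, mask);                /* low 16 bits of x */

    /* 2^39 + hi * 2^16, built directly from the bit pattern */
    __m128 hi_f = _mm_castsi128_ps(_mm_or_si128(hi, onep39));
    __m128 lo_f = _mm_cvtepi32_ps(lo);                  /* exact: lo < 2^16 */

    hi_f = _mm_sub_ps(hi_f, _mm_castsi128_ps(onep39));  /* hi * 2^16, exact */
    return _mm_add_ps(hi_f, lo_f);                      /* the only rounding step */
}

/* Array wrapper for illustration; n is assumed to be a multiple of 4. */
static void convert_u32_to_f32(const uint32_t *src, float *dst, size_t n) {
    for (size_t i = 0; i < n; i += 4) {
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_storeu_ps(dst + i, u32x4_to_f32x4(v));
    }
}
```

On a target where the compiler's own (float)x cast is correctly rounded (x86-64, for instance), the output should match that cast lane for lane, including for 0x88000081.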

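By contrast, your naive implementation first converts the low 31 bits to float, which incurs a rounding, and then conditionally adds 2^31 to that result, which may incur a second rounding. Any time you have two separate rounding points in a conversion, you should not expect the result to be correctly rounded unless you are very careful about how they occur.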
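
To make the double rounding concrete, here is a small sketch of the question's scalar algorithm, assuming float arithmetic is evaluated in single precision (as it is with SSE on x86-64). The name naive_u32_to_float is made up. For x = 0x88000081, the low 31 bits 0x08000081 first round to 0x08000080 (the spacing between floats there is 16); adding 2^31 then gives 0x88000080, which lies exactly halfway between two floats (spacing 256) and rounds to even, down to 0x88000000.

```c
#include <stdint.h>

/* The question's scalar algorithm; hypothetical function name. */
static float naive_u32_to_float(uint32_t x) {
    float bias = 0.0f;
    if (x > 0x7fffffffu) {
        bias = 2147483648.0f;       /* 2^31 */
        x -= 0x80000000u;
    }
    float low = (float)(int32_t)x;  /* first rounding: 31 bits into a 24-bit significand */
    return low + bias;              /* second rounding */
}
```

Under that assumption, naive_u32_to_float(0x88000081u) should come out as 2281701376.0f, while a direct (float)0x88000081u gives the correctly rounded 2281701632.0f.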
