Efficient floating-point comparison (Cortex-A8)

Question


I have a large (~100,000 elements) array of floating-point variables and a threshold, which is also a float.

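The problem is that I have to compare every variable in the array with the threshold, but transferring the NEON flags takes a very long time (about 20 cycles according to a profiler).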

Is there an efficient way to compare these values?

Note: since rounding error does not matter here, I tried the following:

float arr[10000];
float threshold; 
....

int a = arr[20]; // e.g.
int t = threshold;
if (t > a) {....}


But in this case I get the following sequence of processor instructions:

vldr.32        s0, [r0]
vcvt.s32.f32   s0, s0
vmov           r0, s0    <--- takes 20 cycles as `vmrs APSR_nzcv, fpscr` in case of 
cmp            r0, r1         floating point comparison


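Since the conversion happens in NEON, it makes no difference whether I compare integers in the way described above or compare the floats directly.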

Answer 1

Ale*_*nze 5

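If your floats are 32-bit IEEE-754, your ints are 32-bit as well, and there are no +infinity, -infinity, or NaN values, the floats can be compared as integers with a small trick: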

#include <stdio.h>
#include <limits.h>
#include <assert.h>

#define C_ASSERT(expr) extern char CAssertExtern[(expr)?1:-1]
C_ASSERT(sizeof(int) == sizeof(float));
C_ASSERT(sizeof(int) * CHAR_BIT == 32);

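int isGreater(float* f1, float* f2)
{
  int i1, i2, t1, t2;

  // Reinterpret the float bits as 32-bit integers
  i1 = *(int*)f1;
  i2 = *(int*)f2;

  // t1 is all ones for a negative float, zero otherwise
  // (assumes >> on a negative int is an arithmetic shift)
  t1 = i1 >> 31;
  // Negative floats are sign-magnitude: map magnitude m to -m,
  // so that integer order matches float order
  i1 = (i1 ^ t1) + (t1 & 0x80000001);

  t2 = i2 >> 31;
  i2 = (i2 ^ t2) + (t2 & 0x80000001);

  return i1 > i2;
}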

int main(void)
{
  float arr[9] = { -3, -2, -1.5, -1, 0, 1, 1.5, 2, 3 };
  float thr;
  int i;

  // Make sure floats are 32-bit IEEE 754 and are
  // reinterpreted as integers the way we expect
  {
    static const float testf = 8873283.0f;
    unsigned testi = *(unsigned*)&testf;
    assert(testi == 0x4B076543);
  }

  thr = -1.5;
  for (i = 0; i < 9; i++)
  {
    printf("%f %s %f\n", arr[i], "<=\0> " + 3*isGreater(&arr[i], &thr), thr);
  }

  thr = 1.5;
  for (i = 0; i < 9; i++)
  {
    printf("%f %s %f\n", arr[i], "<=\0> " + 3*isGreater(&arr[i], &thr), thr);
  }

  return 0;
}


Output:

-3.000000 <= -1.500000
-2.000000 <= -1.500000
-1.500000 <= -1.500000
-1.000000 >  -1.500000
0.000000 >  -1.500000
1.000000 >  -1.500000
1.500000 >  -1.500000
2.000000 >  -1.500000
3.000000 >  -1.500000
-3.000000 <= 1.500000
-2.000000 <= 1.500000
-1.500000 <= 1.500000
-1.000000 <= 1.500000
0.000000 <= 1.500000
1.000000 <= 1.500000
1.500000 <= 1.500000
2.000000 >  1.500000
3.000000 >  1.500000


Of course, if your threshold does not change, it makes sense to precalculate the final integer value that isGreater() uses in its comparison.
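The precalculation could look roughly like the sketch below. The names float_key() and count_greater() are made up for illustration, and the code assumes 32-bit IEEE-754 floats with no NaN or infinity in the data. It reads the bits with memcpy rather than pointer casts, and maps each float to the same integer key that isGreater() compares. Whether the compiler keeps the array loads in integer registers on a Cortex-A8 is something to confirm in the generated assembly.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Map a 32-bit IEEE-754 float (no NaN, no infinity) to an integer key
   that is ordered the same way as the float values themselves. */
static int32_t float_key(const float *f)
{
    uint32_t u;
    memcpy(&u, f, sizeof u);              /* read the raw bits */
    if (u & 0x80000000u)                  /* negative: key = -(magnitude bits) */
        return -(int32_t)(u & 0x7FFFFFFFu);
    return (int32_t)u;                    /* non-negative: bits are the key */
}

/* Count the elements strictly greater than the threshold. The threshold
   key is computed once, outside the loop. */
size_t count_greater(const float *arr, size_t n, float threshold)
{
    const int32_t tkey = float_key(&threshold);
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (size_t)(float_key(&arr[i]) > tkey);
    return count;
}
```

With the nine test values from main() above, count_greater() should report 6 elements above a threshold of -1.5 and 2 above 1.5, in line with the printed output.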

If you are worried about undefined behavior in the C/C++ code of isGreater() above, you can rewrite it in assembly.
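Short of assembly, one way to stay within defined C behavior is to copy the bits with memcpy and do the arithmetic on unsigned integers. This is only a sketch, not part of the original answer: ordered_key(), is_greater_portable() and agrees_with_float_compare() are hypothetical names, and the same assumptions apply (32-bit IEEE-754 floats, no NaN or infinity). The last function is a quick sanity check against the ordinary float comparison.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Unsigned key with a 2^31 bias: negative floats land below 0x80000000,
   non-negative floats at or above it; +0.0 and -0.0 get the same key. */
static uint32_t ordered_key(const float *f)
{
    uint32_t u;
    memcpy(&u, f, sizeof u);              /* defined way to read the bits */
    uint32_t mag = u & 0x7FFFFFFFu;       /* magnitude without the sign bit */
    return (u & 0x80000000u) ? 0x80000000u - mag : 0x80000000u + mag;
}

/* Returns 1 if *f1 > *f2, using integer comparison only. */
int is_greater_portable(const float *f1, const float *f2)
{
    return ordered_key(f1) > ordered_key(f2);
}

/* Returns 1 if the integer comparison matches the float comparison
   for every ordered pair in vals[0..n-1], 0 otherwise. */
int agrees_with_float_compare(const float *vals, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            if (is_greater_portable(&vals[i], &vals[j]) != (vals[i] > vals[j]))
                return 0;
    return 1;
}
```

Compilers generally turn a fixed-size memcpy like this into a plain load, but that is worth checking in the disassembly for the target at hand.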
