快速逆任意幂根算法实现

Question

快速逆任意幂根算法实现

use*_*232 3 c++ optimization square-root

许多资料表明，众所周知的快速平方根反算法可以推广到计算任意幂的平方根。不幸的是，我还没有找到这样的C ++实现，而且我也不擅长通过数学自己概括这种方法。您能帮助我做到这一点，还是提供现成的解决方案？我认为这对许多人都是有用的，尤其是在有充分解释的情况下。

这是原始算法，例如，我不太了解需要更改的内容1 /cbrt(x)：

float Q_rsqrt( float number )
{
    long i;
    float x2, y;
    const float threehalfs = 1.5F;

    x2 = number * 0.5F;
    y  = number;
    i  = * ( long * ) &y;                       // evil floating point bit level hacking
    i  = 0x5f3759df - ( i >> 1 );               // what the...? 
    y  = * ( float * ) &i;
    y  = y * ( threehalfs - ( x2 * y * y ) );   // 1st iteration
//  y  = y * ( threehalfs - ( x2 * y * y ) );   // 2nd iteration, this can be removed

    return y;
}

Run Code Online (Sandbox Code Playgroud)

Answer 1

Dmy*_*yka 7

该算法包括两个步骤-粗略的求解估计和使用几个Newton方法步骤的求解改进。

粗略估计

基本思想是使用浮点数对数log2(x)与其整数表示之间的关系Ix：

$I_ {x} = Llog_ {2}（x）+ L（B- \ sigma）; log_ {2}（x）\大约\ frac {I_ {x}} {L}-（B- \ sigma）$

（图片来自https://en.wikipedia.org/wiki/Fast_inverse_square_root）

现在使用众所周知的对数标识作为根：

$log_ {2}（\ frac {1} {\ sqrt [n] {x}}）=-\ frac {1} {n} log_ {2}（x）$ 。

结合先前获得的身份，我们得到：

$\ frac {I_ {y}} {L}-（B- \ sigma）=-\ frac {1} {c}（\ frac {I_ {x}} {L}-（B- \ sigma））$

$I_ {y} = \ frac {（n + 1）L（B- \ sigma）} {n}-\ frac {I_ {x}} {n}$

代的数值L * (B - s) = 0x3F7A3BEA，所以

Iy = 0x3F7A3BEA / c * (c + 1) - Ix / c;。

对于简单的浮点数表示形式为整数并返回，可以方便地使用uniontype：

   union 
   { 
      float f; // float representation
      uint32_t i; // integer representation
   } t;

   t.f = x;
   t.i = 0x3F7A3BEA / n * (n + 1) - t.i / n; // Work with integer representation
   float y = t.f; // back to float representation

Run Code Online (Sandbox Code Playgroud)

注意，对于n=2表达式，简化t.i = 0x5f3759df - t.i / 2;为与原始表达式相同i = 0x5f3759df - ( i >> 1 );

牛顿解决方案的改进

改变平等

变成应该解决的方程式：

$x = \ frac {1} {y ^ {n}}; f（y）= \ frac {1} {y ^ {n}}-x; f（y）= 0$

现在构造牛顿步骤：

${f（y）}'=-\ frac {n} {y ^ {n + 1}}$

$y_ {i + 1} = y_ {i} -f（y_ {i}）/ {f}'（y_ {i}）= y_ {i} + \ frac {（\ frac {1} {y ^ {n } _ {i}}-x）y ^ {n + 1} _ {i}} {n} = \ frac {y_ {i}（1 + n-xy_ {i} ^ {n}）} {n}$

以编程方式看起来像：y = y * (1 + n - x * pow(y,n)) / n;。首先y，我们使用在“ 粗略估计”步骤中获得的值。

注意对于平方根（n = 2）的特殊情况，我们得到的y = y * (3 - x*y*y) / 2;结果与原始公式相同y = y * (threehalfs - (x2 * y * y));

最终代码作为模板功能。参数N确定根功率。

template<unsigned N>
float power(float x) {
   if (N % 2 == 0) return power<N / 2>(x * x);
   else if (N % 3 == 0) return power<N / 3>(x * x * x);
   return power<N - 1>(x) * x;
};

template<>
float power<0>(float x){ return 1; }

// fast_inv_nth_root<2>(x) - inverse square root 1/sqrt(x)
// fast_inv_nth_root<3>(x) - inverse cube root 1/cbrt(x)

template <unsigned n>
float fast_inv_nth_root(float x)
{
   union { float f; uint32_t i; } t = { x };

   // Approximate solution
   t.i = 0x3F7A3BEA / n * (n + 1) - t.i / n;
   float y = t.f;

   // Newton's steps. Copy for more accuracy.
   y = y * (n + 1 - x * power<n>(y)) / n;
   y = y * (n + 1 - x * power<n>(y)) / n;
   return y;
}

Run Code Online (Sandbox Code Playgroud)

测试中

测试代码：

int main()
{
   std::cout << "|x          ""|fast2      "" actual2    "
      "|fast3      "" actual3    "
      "|fast4      "" actual4    "
      "|fast5      "" actual5    ""|" << std::endl;

   for (float i = 0.00001; i < 10000; i *= 10)
      std::cout << std::setprecision(5) << std::fixed
      << std::scientific << '|'
      << i << '|'
      << fast_inv_nth_root<2>(i) << " " << 1 / sqrt(i) << "|"
      << fast_inv_nth_root<3>(i) << " " << 1 / cbrt(i) << "|"
      << fast_inv_nth_root<4>(i) << " " << pow(i, -0.25) << "|"
      << fast_inv_nth_root<5>(i) << " " << pow(i, -0.2) << "|"
      << std::endl;
}

Run Code Online (Sandbox Code Playgroud)

结果：

|x          |fast2       actual2    |fast3       actual3    |fast4       actual4    |fast5       actual5    |
|1.00000e-05|3.16226e+02 3.16228e+02|4.64152e+01 4.64159e+01|1.77828e+01 1.77828e+01|9.99985e+00 1.00000e+01|
|1.00000e-04|9.99996e+01 1.00000e+02|2.15441e+01 2.15443e+01|9.99991e+00 1.00000e+01|6.30949e+00 6.30957e+00|
|1.00000e-03|3.16227e+01 3.16228e+01|1.00000e+01 1.00000e+01|5.62339e+00 5.62341e+00|3.98103e+00 3.98107e+00|
|1.00000e-02|9.99995e+00 1.00000e+01|4.64159e+00 4.64159e+00|3.16225e+00 3.16228e+00|2.51185e+00 2.51189e+00|
|1.00000e-01|3.16227e+00 3.16228e+00|2.15443e+00 2.15443e+00|1.77828e+00 1.77828e+00|1.58487e+00 1.58489e+00|
|1.00000e+00|9.99996e-01 1.00000e+00|9.99994e-01 1.00000e+00|9.99991e-01 1.00000e+00|9.99987e-01 1.00000e+00|
|1.00000e+01|3.16226e-01 3.16228e-01|4.64159e-01 4.64159e-01|5.62341e-01 5.62341e-01|6.30948e-01 6.30957e-01|
|1.00000e+02|9.99997e-02 1.00000e-01|2.15443e-01 2.15443e-01|3.16223e-01 3.16228e-01|3.98102e-01 3.98107e-01|
|1.00000e+03|3.16226e-02 3.16228e-02|1.00000e-01 1.00000e-01|1.77827e-01 1.77828e-01|2.51185e-01 2.51189e-01|
|1.00000e+04|9.99996e-03 1.00000e-02|4.64155e-02 4.64159e-02|9.99995e-02 1.00000e-01|1.58487e-01 1.58489e-01|

Run Code Online (Sandbox Code Playgroud)

@DmytroDadyka，你不能。mathexchange，stats等支持MathJax，但不支持stackoverflow-他们认为这对于很少使用的功能来说负担太大 (2认同)

归档时间：	6 年，8 月前
查看次数：	196 次
最近记录：	6 年，7 月前