使用标准C数学库实现sinpi（）和cospi（）

Question

使用标准C数学库实现sinpi（）和cospi（）

nju*_*ffa 11 c floating-point trigonometry math.h

函数sinpi(x)计算sin（？x），函数cospi(x)计算cos（？x），其中乘以？在函数内部是隐式的。这些功能最初是由Sun Microsystems在1980年代末期扩展为C标准数学库的。IEEE标准754™-2008指定的等价的功能sinPi和cosPi在第9。

有许多计算会自然产生sin（？x）和cos（？x）。一个非常简单的例子是Box-Muller变换（GEP Box和Mervin E. Muller，“关于随机正态偏差的生成的注释”。《数学统计》，第29卷，第2期，第610-611 页）），则给定两个独立的随机变量U？和你？分布均匀，产生独立的随机变量Z？和Z？标准正态分布：

Z? = ?(-2 ln U?) cos (2 ? U?)
Z? = ?(-2 ln U?) sin (2 ? U?)

Run Code Online (Sandbox Code Playgroud)

另一个示例是度数参数的正弦和余弦计算，如使用Haversine公式的大圆距离计算：

Z? = ?(-2 ln U?) cos (2 ? U?)
Z? = ?(-2 ln U?) sin (2 ? U?)

Run Code Online (Sandbox Code Playgroud)

对于C ++，Boost库提供sin_pi和 cos_pi，某些供应商提供sinpi和cospi功能作为系统库中的扩展。例如，苹果增加__sinpi，__cospi和相应的单精度版本__sinpif，__cospif到iOS 7和OS X 10.9（介绍，幻灯片101）。但是对于许多其他平台，没有C程序可以轻易访问的实现。

用传统的方法使用，例如比较sin (M_PI * x)和cos (M_PI * x)，使用sinpi和cospi经由减少舍入误差提高了精度，内部乘用？，同时还具有性能优势由于更简单的说法减少。

一个人如何可以使用标准C数学库来实现sinpi()和cospi()功能在一个合理的高效和符合标准的时尚？

Answer 1

nju*_*ffa 9

为简单起见，我将重点介绍sincospi()，它同时提供正弦和余弦结果。sinpi并且cospi可以构造为包装器功能，以丢弃不需要的数据。在许多应用程序中，fenv.h不需要处理浮点标志（请参阅参考资料），而且在errno大多数情况下我们也不需要错误报告，因此我将省略它们。

基本算法结构很简单。由于很大的参数总是偶数，因此2的倍数，它们的正弦和余弦值是众所周知的。在记录象限信息时，其他自变量会折叠到[-¼，+¼]范围内。多项式极小极大值逼近用于计算主逼近间隔上的正弦和余弦。最后，象限数据用于通过周期性交换结果和符号变化将初步结果映射到最终结果。

特殊操作数（特别是-0，infinities和NaN）的正确处理要求编译器仅应用符合IEEE-754规则的优化。它可能不会转化x*0.0成0.0（这不是正确-0，无穷大和NaN）也不可以优化0.0-x成-x为否定是根据IEEE-754的节5.5.1位级操作（产生对于零和NaN不同的结果）。大多数编译器会提供一个标志，用于强制使用“安全”转换，例如-fp-model=precise，对于Intel C / C ++编译器。

另外一个警告适用于nearbyint参数减少期间的函数使用。与相似rint，此函数指定为根据当前舍入模式进行舍入。当fenv.h 不使用时，舍入模式默认为圆形“到最近的或偶数”。使用该功能时，存在直接取整模式生效的风险。可以通过使用解决此问题round，该方法始终提供舍入模式“四舍五入至最接近，从零开始的联系”，而与当前舍入模式无关。但是，由于大多数处理器体系结构上的等效机器指令均不支持此功能，因此该功能往往会变慢。

关于性能的说明：下面的C99代码在很大程度上依赖于的使用fma()，该实现了一个融合的乘加运算。在大多数现代硬件体系结构上，这由相应的硬件指令直接支持。在这种情况下，由于FMA仿真速度通常较慢，因此代码可能会出现严重的性能下降。

 #include <math.h>
 #include <stdint.h>

/* Writes result sine result sin(?a) to the location pointed to by sp
   Writes result cosine result cos(?a) to the location pointed to by cp

   In extensive testing, no errors > 0.97 ulp were found in either the sine
   or cosine results, suggesting the results returned are faithfully rounded.
*/
void my_sincospi (double a, double *sp, double *cp)
{
    double c, r, s, t, az;
    int64_t i;

    az = a * 0.0; // must be evaluated with IEEE-754 semantics
    /* for |a| >= 2**53, cospi(a) = 1.0, but cospi(Inf) = NaN */
    a = (fabs (a) < 9.0071992547409920e+15) ? a : az;  // 0x1.0p53
    /* reduce argument to primary approximation interval (-0.25, 0.25) */
    r = nearbyint (a + a); // must use IEEE-754 "to nearest" rounding
    i = (int64_t)r;
    t = fma (-0.5, r, a);
    /* compute core approximations */
    s = t * t;
    /* Approximate cos(pi*x) for x in [-0.25,0.25] */
    r =            -1.0369917389758117e-4;
    r = fma (r, s,  1.9294935641298806e-3);
    r = fma (r, s, -2.5806887942825395e-2);
    r = fma (r, s,  2.3533063028328211e-1);
    r = fma (r, s, -1.3352627688538006e+0);
    r = fma (r, s,  4.0587121264167623e+0);
    r = fma (r, s, -4.9348022005446790e+0);
    c = fma (r, s,  1.0000000000000000e+0);
    /* Approximate sin(pi*x) for x in [-0.25,0.25] */
    r =             4.6151442520157035e-4;
    r = fma (r, s, -7.3700183130883555e-3);
    r = fma (r, s,  8.2145868949323936e-2);
    r = fma (r, s, -5.9926452893214921e-1);
    r = fma (r, s,  2.5501640398732688e+0);
    r = fma (r, s, -5.1677127800499516e+0);
    s = s * t;
    r = r * s;
    s = fma (t, 3.1415926535897931e+0, r);
    /* map results according to quadrant */
    if (i & 2) {
        s = 0.0 - s; // must be evaluated with IEEE-754 semantics
        c = 0.0 - c; // must be evaluated with IEEE-754 semantics
    }
    if (i & 1) { 
        t = 0.0 - s; // must be evaluated with IEEE-754 semantics
        s = c;
        c = t;
    }
    /* IEEE-754: sinPi(+n) is +0 and sinPi(-n) is -0 for positive integers n */
    if (a == floor (a)) s = az;
    *sp = s;
    *cp = c;
}

Run Code Online (Sandbox Code Playgroud)

单精度版本基本上仅在核心近似上有所不同。使用详尽的测试可以精确确定误差范围。

#include <math.h>
#include <stdint.h>

/* Writes result sine result sin(?a) to the location pointed to by sp
   Writes result cosine result cos(?a) to the location pointed to by cp

   In exhaustive testing, the maximum error in sine results was 0.96677 ulp,
   the maximum error in cosine results was 0.96563 ulp, meaning results are
   faithfully rounded.
*/
void my_sincospif (float a, float *sp, float *cp)
{
    float az, t, c, r, s;
    int32_t i;

    az = a * 0.0f; // must be evaluated with IEEE-754 semantics
    /* for |a| > 2**24, cospi(a) = 1.0f, but cospi(Inf) = NaN */
    a = (fabsf (a) < 0x1.0p24f) ? a : az;
    r = nearbyintf (a + a); // must use IEEE-754 "to nearest" rounding
    i = (int32_t)r;
    t = fmaf (-0.5f, r, a);
    /* compute core approximations */
    s = t * t;
    /* Approximate cos(pi*x) for x in [-0.25,0.25] */
    r =              0x1.d9e000p-3f;
    r = fmaf (r, s, -0x1.55c400p+0f);
    r = fmaf (r, s,  0x1.03c1cep+2f);
    r = fmaf (r, s, -0x1.3bd3ccp+2f);
    c = fmaf (r, s,  0x1.000000p+0f);
    /* Approximate sin(pi*x) for x in [-0.25,0.25] */
    r =             -0x1.310000p-1f;
    r = fmaf (r, s,  0x1.46737ep+1f);
    r = fmaf (r, s, -0x1.4abbfep+2f);
    r = (t * s) * r;
    s = fmaf (t, 0x1.921fb6p+1f, r);
    if (i & 2) {
        s = 0.0f - s; // must be evaluated with IEEE-754 semantics
        c = 0.0f - c; // must be evaluated with IEEE-754 semantics
    }
    if (i & 1) {
        t = 0.0f - s; // must be evaluated with IEEE-754 semantics
        s = c;
        c = t;
    }
    /* IEEE-754: sinPi(+n) is +0 and sinPi(-n) is -0 for positive integers n */
    if (a == floorf (a)) s = az;
    *sp = s;
    *cp = c;
}

Run Code Online (Sandbox Code Playgroud)

@JohnBollinger我不知道。*如果*工具链根据IEEE-754规则对浮点格式和转换提供了足够的控制，则*然后*此代码相对于IEEE-754正常工作（最好我可以测试一下）。相反，如果工具链通常*不*符合IEEE-754，则此代码*不应符合IEEE-754的所有要求（也没有必要）。 (2认同)
@MatíasGiovannini重新排序会导致最大ulp误差增加（有时约为〜1.5 ulp），因此不再忠实地实现（这是我的设计目标）。在某些情况下，这可能是可以接受的。 (2认同)
@MBReynolds 这是可以预料的。根据我负责标准数学库的经验，进行单精度计算的人首先渴望的是性能（3D 图形和信号处理是重要的应用领域）。对于这些人来说，超出以性能为代价的忠实全面实现的准确性并不具有吸引力。相反，进行双精度计算（许多科学计算任务）的人通常希望精度接近 0.5 ulp；添加一些操作来提高到该水平通常是受欢迎的。 (2认同)

归档时间：	8 年，12 月前
查看次数：	1171 次
最近记录：	8 年，12 月前