使用标准 C 数学库精确计算 Lambert W 函数主分支

nju*_*ffa 4 c algorithm math floating-point floating-accuracy

兰伯特W函数是f(w) = we w的反函数。它是一个多值函数,在复数上有无限多个分支,但在实数上只有两个分支,用 W 0和 W -1表示。W 0被认为是主分支,输入域为 [ -1/e, \xe2\x88\x9e ],W -1的输入域为 ( -1/e, 0 )。相应的实现通常称为\nlambert_w0()lambert_wm1()

\n

该函数的近亲首先由Leonard Euler [1]在跟进Johann Heinrich Lambert [2]的工作时发现。欧拉研究了超越方程x \xce\xb1 -x \xce\xb2 = (\xce\xb1 - \xce\xb2) vx \xce\xb1+\xce\xb2的解 ,并在此过程中考虑了简化情况ln x = vx \xce\xb1。在此过程中,他引入了一个辅助函数,该函数具有以下围绕零的级数展开式:

\n

y = 1 + (2 1 /(1\xc2\xb72))u + (3 2 /(1\xc2\xb72\xc2\xb73))u 2 + (4 3 /(1\xc2\xb72\xc2\ xb73\xc2\xb74))u 3 + (5 4 /((1\xc2\xb72 \xe2\x80\xa6 \xc2\xb75))u 4 + \xe2\x80\xa6

\n

用现代术语来说,这个函数(高斯未命名)表示W(-x)/x , ln x = vx \xce\xb1的解为x=(-W(-\xce\xb1v)/-\xce\xb1v) (1/\xce\xb1)

\n

虽然兰伯特 W 函数偶尔出现在文献中,例如[3] ,但直到 20 世纪 90 年代 Robert Corless 的开创性工作(例如[4])才被命名并被认为是一个重要的构建模块。随后,通过持续的研究,Lambert W 函数在数学和物理科学中的适用性得到了扩展,[5]中给出了一些例子。

\n

Lambert W 函数目前并不是 ISO-C 标准数学库的一部分,似乎也没有任何立即添加它的计划。*如何使用 ISO-C 标准数学库准确实现Lambert W 函数的主要分支 W 0 ?

\n

忠实全面的实现可能过于雄心勃勃,但维持 4 ulp 错误界限(由英特尔数学库的 LA 配置文件选择)似乎是可以实现和理想的。可以假定支持 IEEE-754 (2008) 二进制浮点算术,并支持可通过fma()和标准库函数访问的融合乘加 (FMA) 运算。fmaf()

\n
\n

[1] Leonard Euler, \xe2\x80\x9cDe serie Lambertina plurimisque eius insignibus proprietatibus,\xe2\x80\x9d (论 Lambert\xe2\x80\x99s 系列及其许多独特的特性) Acta Academiae Scientiarum Imperialis Petropolitanae pro Anno MDCCLXXIX , Tomus III,Pars II,(1779 年圣彼得堡帝国科学院院刊,第 3 卷,第 2 部分,7 月至 12 月),圣彼得堡:科学院 1783 年,第 29-51 页(慕尼黑巴伐利亚州立图书馆在线扫描)

\n

[2] Johann Heinrich Lambert,“Observationes variae in mathesin puram”(纯数学的各种观察)Acta Helveticae Physico-Mathematico-anatomico-botanico-medica,Vol.1。3,巴塞尔:JR Imhof 1758,第 128-168 页(在生物多样性遗产图书馆在线扫描)

\n

[3] FN Fritsch、RE Shafer 和 WP Crowley,“算法 443:超越方程we x =x的解”,ACM 通信,卷。16,第 2 期,1973 年 2 月,第 123-124 页。

\n

[4] RM Corless 等人,“论 Lambert W 函数”,计算数学进展,卷。5,第 1 期,1996 年 12 月,第 329-359 页

\n

[5] Iordanis Kesisoglou、Garima Singh 和 Michael Nikolaou,“兰伯特函数应该出现在工程数学工具箱中”,计算机与化学工程,148,2021 年 5 月

\n

nju*_*ffa 7

从文献中可以清楚地看出,在实数上计算 Lambert W 函数的最常见方法是通过函数迭代。二阶牛顿法是这些方案中最简单的:

\n

w i+1 = w i - (w i exp(w i ) - x) /\n(exp (w i ) + w i exp(w i ))。

\n

许多文献更喜欢高阶方法,例如 Halley、Fritsch 和 Schroeder 的方法。在探索性工作中,我发现当以有限精度浮点运算执行时,这些高阶迭代的数值特性并不像阶数所暗示的那么有利。作为一个特定的例子,三个牛顿迭代在准确性方面始终优于两个哈雷迭代。出于这个原因,我选择牛顿迭代作为主要构建块,并使用了自定义实现exp()只需要提供可表示为 IEEE-754 术语中的正法线的结果即可恢复一些性能。

\n

我进一步发现,对于大操作数,牛顿迭代只会缓慢收敛,特别是当起始近似值不是很准确时。由于高迭代次数不利于获得良好的性能,因此我四处寻找替代方案,并在 Iacono 和 Boyd [*]的基于对数的迭代方案中找到了一个优秀的候选方案,该方案也具有二阶收敛性:

\n

w i+1 = (w i / (1 + w i )) * (1 + log (x / w i )

\n

Lambert W 函数的许多实现似乎对输入域的不同部分使用不同的起始近似值。我的偏好是在整个输入域中使用单一的起始近似值,以考虑未来的矢量化实现。

\n

幸运的是,Iacono 和 Boyd 还提供了一种适用于 W 0整个输入域的通用起始近似,虽然没有完全实现其承诺,但性能非常好。我针对单精度实现对此进行了微调,该实现处理更窄的输入域,使用优化搜索启发式来实现最佳的准确性。我还采用了自定义实现,log()只需处理正法线的输入。

\n

在起始近似和牛顿迭代中必须小心,以避免中间计算中的上溢和下溢。这可以通过适当的 2 次方缩放来轻松且廉价地实现。

\n

虽然最终的迭代方案通常会提供准确的结果,但对于接近零的参数和接近-1/e \xe2\x89\x88 -0.367879 的参数,会出现许多 ulps 的错误。我通过使用围绕零的泰勒级数展开式的前几项来解决前一个问题:x - x 2 + (3/2)x 3。事实上,[ -1/e, 0 ] 上的W 0 \xe2\x89\x88 \xe2\x88\x9a(1+ex)-1表明使用极小极大多项式近似p(t)t=\xe2 \x88\x9a(x+1/e)事实证明在-1/e附近工作得相当好。我用 Remez 算法生成了这个近似值。

\n

binary32映射到 的IEEE-754floatbinary64映射到 的IEEE-754所实现的精度double完全在指定的误差范围内。正半平面最大误差小于1.5 ulps,负半平面最大误差小于2.7 ulps。单精度代码经过了详尽的测试,而双精度代码则使用了十亿个随机参数进行了测试。

\n
\n

[*] 罗伯托·伊科诺和约翰·P·博伊德。“兰伯特 W 函数主要实值分支的新近似值。” 计算数学进展,卷。43,第6期,2017年12月,第1403-1436页。

\n

Lambert W 0函数的单精度实现如下:

\n
float expf_scale_pos_normal (float a, int scale);\nfloat logf_pos_normal (float a);\n\n/*\n  Compute the principal branch of the Lambert W function, W_0. The maximum\n  error in the positive half-plane is 1.49874 ulps and the maximum error in \n  the negative half-plane is 2.56002 ulps\n*/\nfloat lambert_w0f (float z) \n{\n    const float em1_fact_0 = 0.625529587f; // exp(-1)_factor_0\n    const float em1_fact_1 = 0.588108778f; // exp(-1)_factor_1\n    const float qe1 = 2.71828183f / 4.0f; //  0x1.5bf0a8p-01 // exp(1)/4\n    float e, w, num, den, rden, redz, y, r;\n\n    if (isnan (z) || (z == INFINITY) || (z == 0.0f)) return z + z;\n    if (fabsf (z) < 1.220703125e-4f) return fmaf (-z, z, z); // 0x1.0p-13\n    redz = fmaf (em1_fact_0, em1_fact_1, z); // z + exp(-1)\n    if (redz < 0.0625f) { // expansion at -(exp(-1))\n        r = sqrtf (redz);\n        w =             -1.23046875f;  // -0x1.3b0000p+0\n        w = fmaf (w, r,  2.17185670f); //  0x1.15ff66p+1\n        w = fmaf (w, r, -2.19554094f); // -0x1.19077cp+1\n        w = fmaf (w, r,  1.92107077f); //  0x1.ebcb4cp+0\n        w = fmaf (w, r, -1.81141856f); // -0x1.cfb920p+0\n        w = fmaf (w, r,  2.33162979f); //  0x1.2a72d8p+1\n        w = fmaf (w, r, -1.00000000f); // -0x1.000000p+0\n    } else {\n        /* Compute initial approximation. Based on: Roberto Iacono and John \n           Philip Boyd, "New approximations to the principal real-valued branch\n           of the Lambert W function", Advances in Computational Mathematics, \n           Vol. 43, No. 6, December 2017, pp. 1403-1436\n        */\n        y = fmaf (2.0f, sqrtf (fmaf (qe1, z, 0.25f)), 1.0f);\n        y = logf_pos_normal (fmaf (1.15262585f, y, -0.15262585f) / \n                             fmaf (0.45906518f, logf_pos_normal (y), 1.0f));\n        w = fmaf (2.0390625f, y, -1.0f);\n\n        /* perform Newton iterations to refine approximation to full accuracy */\n        for (int i = 0; i < 3; i++) {\n            e = expf_scale_pos_normal (w, -3); // 0.125f * expf (w);\n            num = fmaf (w, e, -0.125f * z);\n            den = fmaf (w, e, e);\n            rden = 1.0f / den;\n            w = fmaf (-num, rden, w);\n        }\n    }\n    return w;\n}\n\nfloat uint32_as_float (uint32_t a)\n{\n    float r;\n    memcpy (&r, &a, sizeof r);\n    return r;\n}\n\nuint32_t float_as_uint32 (float a)\n{\n    uint32_t r;\n    memcpy (&r, &a, sizeof r);\n    return r;\n}\n\n/* exp(a) * 2**scale; positive normal results only! Maximum error 0.86565 ulp */\nfloat expf_scale_pos_normal (float a, int scale)\n{\n    const float flt_int_cvt = 12582912.0f; // 0x1.8p23 \n    float f, r, j, t;\n    uint32_t i;\n\n    /* exp(a) = 2**i * exp(f); i = rintf (a / log(2)) */\n    j = fmaf (1.442695f, a, flt_int_cvt); // // 0x1.715476p0 // log2(e)\n    t = j - flt_int_cvt; \n    f = fmaf (t, -6.93145752e-1f, a); // -0x1.62e400p-1  // log_2_hi \n    f = fmaf (t, -1.42860677e-6f, f); // -0x1.7f7d1cp-20 // log_2_lo \n    i = float_as_uint32 (j);\n\n    /* approximate r = exp(f) on interval [-log(2)/2, +log(2)/2] */\n    r =             1.37805939e-3f;  // 0x1.694000p-10\n    r = fmaf (r, f, 8.37312452e-3f); // 0x1.125edcp-7\n    r = fmaf (r, f, 4.16695364e-2f); // 0x1.555b5ap-5\n    r = fmaf (r, f, 1.66664720e-1f); // 0x1.555450p-3\n    r = fmaf (r, f, 4.99999851e-1f); // 0x1.fffff6p-2\n    r = fmaf (r, f, 1.00000000e+0f); // 0x1.000000p+0\n    r = fmaf (r, f, 1.00000000e+0f); // 0x1.000000p+0\n\n    /* exp(a) = 2**(i+scale) * r; */\n    r = uint32_as_float (((i + scale) << 23) + float_as_uint32 (r));\n    return r;\n}\n\n/* compute natural logarithm of positive normals; maximum error: 0.85089 ulp */\nfloat logf_pos_normal (float a)\n{\n    const float ln2 = 0.693147182f; // 0x1.62e430p-1 // log(2)\n    const float two_to_m23 = 1.19209290e-7f; // 0x1.0p-23\n    float m, r, s, t, i, f;\n    int32_t e;\n\n    /* log(a) = log(m * 2**i) = i * log(2) + log(m) */\n    e = (float_as_uint32 (a) - float_as_uint32 (0.666666667f)) & 0xff800000;\n    m = uint32_as_float (float_as_uint32 (a) - e);\n    i = (float)e * two_to_m23;\n\n    /* log(m) = log1p(f) */\n    f = m - 1.0f;\n    s = f * f;\n\n    /* compute log1p(f) for f in [-1/3, 1/3] */\n    r =             -0.130310059f;  // -0x1.0ae000p-3\n    t =              0.140869141f;  //  0x1.208000p-3\n    r = fmaf (r, s, -0.121483363f); // -0x1.f1988ap-4\n    t = fmaf (t, s,  0.139814854f); //  0x1.1e5740p-3\n    r = fmaf (r, s, -0.166846141f); // -0x1.55b36ep-3\n    t = fmaf (t, s,  0.200120345f); //  0x1.99d8b2p-3\n    r = fmaf (r, s, -0.249996200f); // -0x1.fffe02p-3\n    r = fmaf (t, f, r);\n    r = fmaf (r, f,  0.333331972f); //  0x1.5554fap-2\n    r = fmaf (r, f, -0.500000000f); // -0x1.000000p-1\n    r = fmaf (r, s, f);\n\n    /* log(a) = i * log(2) + log(m) */\n    r = fmaf (i, ln2, r);\n    return r;\n}\n
Run Code Online (Sandbox Code Playgroud)\n

双精度实现在结构上与单精度实现等效,只是它使用 Iacono-Boyd 迭代方案:

\n
double exp_scale_pos_normal (double a, int scale);\ndouble log_pos_normal (double a);\n\n/* Compute the principal branch of the Lambert W function, W_0. Maximum error:\n   positive half-plane: 1.49210 ulp\n   negative half-plane: 2.67824 ulp\n*/\ndouble lambert_w0 (double z) \n{\n    const double em1_fact_0 = 0.57086272525975246; // 0x1.24481e7efdfcep-1 // exp(-1)_factor_0\n    const double em1_fact_1 = 0.64442715366299452; // 0x1.49f25b1b461b7p-1 // exp(-1)_factor_1\n    const double qe1 = 2.7182818284590452 * 0.25;  // 0x1.5bf0a8b145769p-1 // exp(1)/4\n    double e, r, t, w, y, num, den, rden, redz;\n    int i;\n    \n    if (isnan (z) || (z == INFINITY) || (z == 0.0)) return z + z;\n    if (fabs (z) < 1.9073486328125e-6) return fma (fma (1.5, z, -1.) * z, z, z);\n    redz = fma (em1_fact_0, em1_fact_1, z); // z + exp(-1)\n    if (redz < 0.01025390625) { // expansion at -(exp(-1))\n        r = sqrt (redz);\n        w =            -7.8466654751155138;  // -0x1.f62fc463917ffp+2\n        w = fma (w, r, 10.0241581340373877); //  0x1.40c5e74773ef5p+3\n        w = fma (w, r, -8.1029379749359691); // -0x1.034b44947bba0p+3\n        w = fma (w, r,  5.8322883145113726); //  0x1.75443634ead5fp+2\n        w = fma (w, r, -4.1738796362609882); // -0x1.0b20d80dcb9acp+2\n        w = fma (w, r,  3.0668053943936471); //  0x1.888d14440efd0p+1\n        w = fma (w, r, -2.3535499689514934); // -0x1.2d41201913016p+1\n        w = fma (w, r,  1.9366310979331112); //  0x1.efc70e3e0a0eap+0\n        w = fma (w, r, -1.8121878855270763); // -0x1.cfeb8b968bd2cp+0\n        w = fma (w, r,  2.3316439815968506); //  0x1.2a734f5b6fd56p+1\n        w = fma (w, r, -1.0000000000000000); // -0x1.0000000000000p+0\n        return w;\n    }\n    /* Roberto Iacono and John Philip Boyd, "New approximations to the \n       principal real-valued branch of the Lambert W function", Advances \n       in Computational Mathematics, Vol. 43, No. 6, December 2017, \n       pp. 1403-1436\n     */\n    y = fma (2.0, sqrt (fma (qe1, z, 0.25)), 1.0);\n    y = log_pos_normal (fma (1.14956131, y, -0.14956131) / \n                        fma (0.4549574, log_pos_normal (y), 1.0));\n    w = fma (2.036, y, -1.0);\n\n    /* Use iteration scheme w = (w / (1 + w)) * (1 + log (z / w) from\n       Roberto Iacono and John Philip Boyd, "New approximations to the \n       principal real-valued branch of the Lambert W function", Advances\n       in Computational Mathematics, Vol. 43, No. 6, December 2017, pp. \n       1403-1436\n    */\n    for (i = 0; i < 3; i++) {\n        t = w / (1.0 + w);\n        w = fma (log_pos_normal (z / w), t, t);\n    }\n\n    /* Fine tune approximation with a single Newton iteration */\n    e = exp_scale_pos_normal (w, -3); // 0.125 * exp (w)\n    num = fma (w, e, -0.125 *z);\n    den = fma (w, e, e);\n    rden = 1.0 / den;\n    w = fma (-num, rden, w);\n\n    return w;\n}\n\nint double2hiint (double a)\n{\n    unsigned long long int t;\n    memcpy (&t, &a, sizeof t);\n    return (int)(t >> 32);\n}\n\nint double2loint (double a)\n{\n    unsigned long long int t;\n    memcpy (&t, &a, sizeof t);\n    return (int)(unsigned int)t;\n}\n\ndouble hiloint2double (int hi, int lo) \n{\n    double r;\n    unsigned long long int t;\n    t = ((unsigned long long int)(unsigned int)hi << 32) | (unsigned int)lo;\n    memcpy (&r, &t, sizeof r);\n    return r;\n}\n\n/* exp(a) * 2**scale; pos. normal results only! Max. err. found: 0.89028 ulp */\ndouble exp_scale_pos_normal (double a, int scale)\n{\n    const double ln2_hi = 6.9314718055829871e-01; // 0x1.62e42fefa00000p-01\n    const double ln2_lo = 1.6465949582897082e-12; // 0x1.cf79abc9e3b3a0p-40\n    const double l2e = 1.4426950408889634; // 0x1.71547652b82fe0p+00 // log2(e)\n    const double cvt = 6755399441055744.0; // 0x1.80000000000000p+52 // 3*2**51\n    double f, r;\n    int i;\n\n    /* exp(a) = exp(i + f); i = rint (a / log(2)) */\n    r = fma (l2e, a, cvt);\n    i = double2loint (r);\n    r = r - cvt;\n    f = fma (r, -ln2_hi, a);\n    f = fma (r, -ln2_lo, f);\n\n    /* approximate r = exp(f) on interval [-log(2)/2,+log(2)/2] */\n    r =            2.5022018235176802e-8;  // 0x1.ade0000000000p-26\n    r = fma (r, f, 2.7630903497145818e-7); // 0x1.28af3fcbbf09bp-22\n    r = fma (r, f, 2.7557514543490574e-6); // 0x1.71dee623774fap-19\n    r = fma (r, f, 2.4801491039409158e-5); // 0x1.a01997c8b50d7p-16\n    r = fma (r, f, 1.9841269589068419e-4); // 0x1.a01a01475db8cp-13\n    r = fma (r, f, 1.3888888945916566e-3); // 0x1.6c16c1852b805p-10\n    r = fma (r, f, 8.3333333334557735e-3); // 0x1.11111111224c7p-7\n    r = fma (r, f, 4.1666666666519782e-2); // 0x1.55555555502a5p-5\n    r = fma (r, f, 1.6666666666666477e-1); // 0x1.5555555555511p-3\n    r = fma (r, f, 5.0000000000000122e-1); // 0x1.000000000000bp-1\n    r = fma (r, f, 1.0000000000000000e+0); // 0x1.0000000000000p+0\n    r = fma (r, f, 1.0000000000000000e+0); // 0x1.0000000000000p+0\n\n    /* exp(a) = 2**(i+scale) * r */\n    r = hiloint2double (double2hiint (r) + ((i + scale) << 20), \n                        double2loint (r));\n    return r;\n}\n\n/* compute natural logarithm of positive normals; max. err. found: 0.86902 ulp*/\ndouble log_pos_normal (double a)\n{\n    const double ln2_hi = 6.9314718055994529e-01; // 0x1.62e42fefa39efp-01\n    const double ln2_lo = 2.3190468138462996e-17; // 0x1.abc9e3b39803fp-56\n    double m, r, i, s, t, p, q;\n    int e;\n\n    /* log(a) = log(m * 2**i) = i * log(2) + log(m) */\n    e = (double2hiint (a) - double2hiint (0.70703125)) & 0xfff00000;\n    m = hiloint2double (double2hiint (a) - e, double2loint (a));\n    t = hiloint2double (0x41f00000, 0x80000000 ^ e);\n    i = t - (hiloint2double (0x41f00000, 0x80000000));\n\n    /* m now in [181/256, 362/256]. Compute q = (m-1) / (m+1) */\n    p = m + 1.0;\n    r = 1.0 / p;\n    q = fma (m, r, -r);\n    m = m - 1.0;\n\n    /* compute (2*atanh(q)/q-2*q) as p(q**2), q in [-75/437, 53/309] */\n    s = q * q;\n    r =            1.4794533702196025e-1;  // 0x1.2efdf700d7135p-3\n    r = fma (r, s, 1.5314187748152339e-1); // 0x1.39a272db730f7p-3\n    r = fma (r, s, 1.8183559141306990e-1); // 0x1.746637f2f191bp-3\n    r = fma (r, s, 2.2222198669309609e-1); // 0x1.c71c522a64577p-3\n    r = fma (r, s, 2.8571428741489319e-1); // 0x1.24924941c9a2fp-2\n    r = fma (r, s, 3.9999999999418523e-1); // 0x1.999999998006cp-2\n    r = fma (r, s, 6.6666666666667340e-1); // 0x1.5555555555592p-1\n    r = r * s;\n\n    /* log(a) = 2*atanh(q) + i*log(2) = ln2_lo*i + p(q**2)*q + 2q + ln2_hi * i.\n       Use K.C. Ng\'s trick to improve the accuracy of the computation, like so:\n       p(q**2)*q + 2q = p(q**2)*q + q*t - t + m, where t = m**2/2.\n    */\n    t = m * m * 0.5;\n    r = fma (q, t, fma (q, r, ln2_lo * i)) - t + m;\n    r = fma (ln2_hi, i, r);\n\n    return r;\n}\n
Run Code Online (Sandbox Code Playgroud)\n