njuffa · tags: c, algorithm, math, floating-point, floating-accuracy
The Lambert W function is the inverse function of f(w) = w·e^w. It is a multi-valued function with infinitely many branches over the complex numbers, but only two branches over the real numbers, denoted W_0 and W_-1. W_0 is considered the principal branch, with input domain [-1/e, ∞), while W_-1 has input domain [-1/e, 0). The corresponding implementations are commonly called lambert_w0() and lambert_wm1().
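Stated as a worked relation (a restatement of the definition above, which is also handy for spot checks later): w = W_0(x) if and only if w·exp(w) = x and w ≥ -1. For example, W_0(-1/e) = -1, W_0(0) = 0, and W_0(e) = 1.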
A close relative of this function was first examined by Leonard Euler [1] when following up on work by Johann Heinrich Lambert [2]. Euler studied solutions of the transcendental equation x^α - x^β = (α - β)·v·x^(α+β) and, in the process, considered the simplified case ln x = v·x^α. Along the way he introduced an auxiliary function with the following series expansion around zero:
y = 1 + (2^1/(1·2))·u + (3^2/(1·2·3))·u^2 + (4^3/(1·2·3·4))·u^3 + (5^4/(1·2·3·4·5))·u^4 + …
In modern terms, this function (which was left unnamed) represents W(-x)/x, and the solution of ln x = v·x^α is x = (-W(-αv)/(-αv))^(1/α).
While the Lambert W function appeared in the literature from time to time, e.g. [3], it was not until the seminal work of Robert Corless in the 1990s (e.g. [4]) that it received its name and was recognized as an important building block. Continued research has since broadened the applicability of the Lambert W function across the mathematical and physical sciences; some examples are given in [5].
The Lambert W function is not currently part of the ISO-C standard math library, nor do there appear to be any immediate plans to add it. How can the principal branch W_0 of the Lambert W function be implemented accurately using the ISO-C standard math library?
A faithfully-rounded implementation across the whole domain is probably too ambitious, but maintaining a 4-ulp error bound (the bound chosen by the LA profile of the Intel Math Library) seems both achievable and desirable. Support for IEEE-754 (2008) binary floating-point arithmetic can be assumed, including fused multiply-add (FMA) operations accessible through the standard library functions fma() and fmaf().
[1] Leonard Euler, "De serie Lambertina plurimisque eius insignibus proprietatibus" (On Lambert's series and its many remarkable properties), Acta Academiae Scientiarum Imperialis Petropolitanae pro Anno MDCCLXXIX, Tomus III, Pars II (Proceedings of the Imperial Academy of Sciences in St. Petersburg for the year 1779, Vol. 3, Part 2, July-December), St. Petersburg: Academy of Sciences, 1783, pp. 29-51 (online scan at the Bavarian State Library, Munich)
[2] Johann Heinrich Lambert, "Observationes variae in mathesin puram" (Various observations in pure mathematics), Acta Helvetica Physico-Mathematico-Anatomico-Botanico-Medica, Vol. 3, Basel: J. R. Imhof, 1758, pp. 128-168 (online scan at the Biodiversity Heritage Library)
[3] F. N. Fritsch, R. E. Shafer, and W. P. Crowley, "Algorithm 443: Solution of the transcendental equation we^w = x", Communications of the ACM, Vol. 16, No. 2, February 1973, pp. 123-124
[4] R. M. Corless, et al., "On the Lambert W Function", Advances in Computational Mathematics, Vol. 5, No. 1, December 1996, pp. 329-359
[5] Iordanis Kesisoglou, Garima Singh, and Michael Nikolaou, "The Lambert Function Should Be in the Engineering Mathematical Toolbox", Computers & Chemical Engineering, Vol. 148, May 2021
From the literature it is clear that the most common way of computing the Lambert W function over the real numbers is via functional iteration. The second-order Newton method is the simplest of these schemes:
w_{i+1} = w_i - (w_i·exp(w_i) - x) / (exp(w_i) + w_i·exp(w_i))
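For illustration only (this is not the posted implementation), a bare-bones version of this Newton refinement using the standard math library might look as follows; the starting guess is a crude placeholder, out-of-domain arguments are not validated, and arguments extremely close to DBL_MAX can overflow the w·exp(w) intermediate, which the real code avoids by scaling with powers of two, as discussed further down:

#include <math.h>

/* Illustrative sketch: refine W_0(x) with plain Newton steps on
   f(w) = w*exp(w) - x, using the standard math library. */
static double lambert_w0_newton_sketch (double x)
{
    /* rough initial guess: asymptotic log(x) - log(log(x)) for large x, else x itself */
    double w = (x > 2.718281828459045) ? (log (x) - log (log (x))) : x;
    for (int i = 0; i < 100; i++) {
        double e   = exp (w);
        double num = fma (w, e, -x);     /* w*exp(w) - x   */
        double den = fma (w, e, e);      /* exp(w)*(1 + w) */
        double wn  = w - num / den;
        if (wn == w) break;              /* no further change: converged */
        w = wn;
    }
    return w;
}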
Much of the literature prefers higher-order methods, such as those of Halley, Fritsch, and Schroeder. During exploratory work I found that the numerical properties of these higher-order iterations, when performed in finite-precision floating-point arithmetic, are not as favorable as their order might suggest. As a specific example, three Newton iterations consistently beat two Halley iterations in terms of accuracy. For this reason I settled on the Newton iteration as the main building block, and recovered some performance by using a custom implementation of exp() that only needs to deliver results representable as positive normals in IEEE-754 terms.
I further found that for large operands the Newton iteration converges only slowly, in particular when the starting approximation is not very accurate. Since a high iteration count is not conducive to good performance, I looked for alternatives and found an excellent candidate in the logarithm-based iteration scheme of Iacono and Boyd [*], which likewise has second-order convergence:
w_{i+1} = (w_i / (1 + w_i)) · (1 + log(x / w_i))
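One step of this iteration maps directly onto an FMA, mirroring the loop in the double-precision code further down (sketch only; the standard log() stands in for the custom log_pos_normal() used there, and both x and the current approximation w must be positive):

#include <math.h>

/* Sketch of one logarithm-based Iacono-Boyd step:
   w_{i+1} = (w_i / (1 + w_i)) * (1 + log (x / w_i)) */
static double iacono_boyd_step (double x, double w)
{
    double t = w / (1.0 + w);
    return fma (log (x / w), t, t);   /* t * (1 + log (x / w)) */
}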
Many implementations of the Lambert W function appear to use different starting approximations for different portions of the input domain. My preference was for a single starting approximation across the entire input domain, with a view to future vectorized implementations.
Luckily, Iacono and Boyd also provide a universal starting approximation that works across the entire input domain of W_0 and that, while not quite living up to its promise, performs very well. I fine-tuned it for the single-precision implementation, which handles a much narrower input domain, using an optimizing search heuristic to achieve the best possible accuracy. I also employed a custom implementation of log() that only needs to handle inputs that are positive normals.
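For concreteness, here is the Iacono-Boyd starting approximation as it appears in the double-precision code further down, rewritten as a stand-alone sketch with the standard math library (w0_initial_guess is a made-up name; the single-precision code uses slightly re-tuned constants):

#include <math.h>

/* Sketch of the Iacono-Boyd global starting approximation for W_0; constants
   as used in the double-precision code below. The exp(1)/4 scaling keeps the
   intermediate product below the overflow threshold for very large z. */
static double w0_initial_guess (double z)
{
    const double qe1 = 2.7182818284590452 * 0.25;           /* exp(1)/4 */
    double y = fma (2.0, sqrt (fma (qe1, z, 0.25)), 1.0);    /* 1 + sqrt(1 + e*z) */
    double num = fma (1.14956131, y, -0.14956131);
    double den = fma (0.4549574, log (y), 1.0);
    return fma (2.036, log (num / den), -1.0);
}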
Care must be taken in both the starting approximation and the Newton iterations to avoid overflow and underflow in intermediate computations. This is easily and cheaply accomplished with appropriate scaling by powers of two.
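As an example of such scaling, the Newton step in the code below computes e^w and z scaled by 2^-3; the common factor cancels in the quotient, so the result is unchanged while the intermediate products stay well inside the representable range (sketch only; expf() stands in for the custom expf_scale_pos_normal(), which folds the scale factor directly into the exponent field):

#include <math.h>

/* Sketch of one power-of-two-scaled Newton step, as used in the code below. */
static float newton_step_scaled (float z, float w)
{
    float e   = 0.125f * expf (w);         /* exp(w) * 2**-3                */
    float num = fmaf (w, e, -0.125f * z);  /* (w*exp(w) - z) * 2**-3        */
    float den = fmaf (w, e, e);            /* (exp(w) + w*exp(w)) * 2**-3   */
    return fmaf (-num, 1.0f / den, w);     /* the 2**-3 factors cancel here */
}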
While the resulting iteration scheme generally delivers accurate results, errors of many ulps occur for arguments near zero and for arguments near -1/e ≈ -0.367879. I solved the former issue by using the first few terms of the Taylor series expansion around zero: x - x^2 + (3/2)x^3. The fact that W_0 ≈ √(1+ex) - 1 on [-1/e, 0] suggests using a minimax polynomial approximation p(t) in t = √(x + 1/e), which turns out to work quite well near -1/e. I generated this approximation with the Remez algorithm.
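In code, the near-zero fast paths reduce to a couple of FMAs; the snippets below are extracted from the implementations that follow and simply evaluate the leading Taylor terms:

#include <math.h>

/* Near-zero fast paths, matching the code below: leading terms of the
   Taylor series W_0(z) = z - z^2 + (3/2)*z^3 - ... around zero. */
static float  w0_near_zero_f (float z)  { return fmaf (-z, z, z); }                    /* z - z^2             */
static double w0_near_zero   (double z) { return fma (fma (1.5, z, -1.0) * z, z, z); } /* z - z^2 + (3/2)*z^3 */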
The accuracy achieved by the implementations below, with IEEE-754 binary32 mapped to float and IEEE-754 binary64 mapped to double, is well within the specified error bound. The maximum error in the positive half-plane is less than 1.5 ulps, and the maximum error in the negative half-plane is less than 2.7 ulps. The single-precision code was tested exhaustively, while the double-precision code was tested with a billion random arguments.
[*] Roberto Iacono and John P. Boyd, "New approximations to the principal real-valued branch of the Lambert W function", Advances in Computational Mathematics, Vol. 43, No. 6, December 2017, pp. 1403-1436
The single-precision implementation of the Lambert W_0 function looks as follows:
#include <stdint.h>
#include <string.h>
#include <math.h>

float expf_scale_pos_normal (float a, int scale);
float logf_pos_normal (float a);

/*
  Compute the principal branch of the Lambert W function, W_0. The maximum
  error in the positive half-plane is 1.49874 ulps and the maximum error in
  the negative half-plane is 2.56002 ulps
*/
float lambert_w0f (float z) 
{
    const float em1_fact_0 = 0.625529587f;  // exp(-1)_factor_0
    const float em1_fact_1 = 0.588108778f;  // exp(-1)_factor_1
    const float qe1 = 2.71828183f / 4.0f;   // 0x1.5bf0a8p-01 // exp(1)/4
    float e, w, num, den, rden, redz, y, r;

    if (isnan (z) || (z == INFINITY) || (z == 0.0f)) return z + z;
    if (fabsf (z) < 1.220703125e-4f) return fmaf (-z, z, z); // 0x1.0p-13
    redz = fmaf (em1_fact_0, em1_fact_1, z); // z + exp(-1)
    if (redz < 0.0625f) { // expansion at -(exp(-1))
        r = sqrtf (redz);
        w = -1.23046875f;              // -0x1.3b0000p+0
        w = fmaf (w, r, 2.17185670f);  //  0x1.15ff66p+1
        w = fmaf (w, r, -2.19554094f); // -0x1.19077cp+1
        w = fmaf (w, r, 1.92107077f);  //  0x1.ebcb4cp+0
        w = fmaf (w, r, -1.81141856f); // -0x1.cfb920p+0
        w = fmaf (w, r, 2.33162979f);  //  0x1.2a72d8p+1
        w = fmaf (w, r, -1.00000000f); // -0x1.000000p+0
    } else {
        /* Compute initial approximation. Based on: Roberto Iacono and John
           Philip Boyd, "New approximations to the principal real-valued branch
           of the Lambert W function", Advances in Computational Mathematics,
           Vol. 43, No. 6, December 2017, pp. 1403-1436
        */
        y = fmaf (2.0f, sqrtf (fmaf (qe1, z, 0.25f)), 1.0f);
        y = logf_pos_normal (fmaf (1.15262585f, y, -0.15262585f) / 
                             fmaf (0.45906518f, logf_pos_normal (y), 1.0f));
        w = fmaf (2.0390625f, y, -1.0f);

        /* perform Newton iterations to refine approximation to full accuracy */
        for (int i = 0; i < 3; i++) {
            e = expf_scale_pos_normal (w, -3); // 0.125f * expf (w);
            num = fmaf (w, e, -0.125f * z);
            den = fmaf (w, e, e);
            rden = 1.0f / den;
            w = fmaf (-num, rden, w);
        }
    }
    return w;
}

float uint32_as_float (uint32_t a)
{
    float r;
    memcpy (&r, &a, sizeof r);
    return r;
}

uint32_t float_as_uint32 (float a)
{
    uint32_t r;
    memcpy (&r, &a, sizeof r);
    return r;
}

/* exp(a) * 2**scale; positive normal results only! Maximum error 0.86565 ulp */
float expf_scale_pos_normal (float a, int scale)
{
    const float flt_int_cvt = 12582912.0f; // 0x1.8p23
    float f, r, j, t;
    uint32_t i;

    /* exp(a) = 2**i * exp(f); i = rintf (a / log(2)) */
    j = fmaf (1.442695f, a, flt_int_cvt); // 0x1.715476p0 // log2(e)
    t = j - flt_int_cvt; 
    f = fmaf (t, -6.93145752e-1f, a);  // -0x1.62e400p-1  // log_2_hi
    f = fmaf (t, -1.42860677e-6f, f);  // -0x1.7f7d1cp-20 // log_2_lo
    i = float_as_uint32 (j);

    /* approximate r = exp(f) on interval [-log(2)/2, +log(2)/2] */
    r = 1.37805939e-3f;               // 0x1.694000p-10
    r = fmaf (r, f, 8.37312452e-3f);  // 0x1.125edcp-7
    r = fmaf (r, f, 4.16695364e-2f);  // 0x1.555b5ap-5
    r = fmaf (r, f, 1.66664720e-1f);  // 0x1.555450p-3
    r = fmaf (r, f, 4.99999851e-1f);  // 0x1.fffff6p-2
    r = fmaf (r, f, 1.00000000e+0f);  // 0x1.000000p+0
    r = fmaf (r, f, 1.00000000e+0f);  // 0x1.000000p+0

    /* exp(a) = 2**(i+scale) * r; */
    r = uint32_as_float (((i + scale) << 23) + float_as_uint32 (r));
    return r;
}

/* compute natural logarithm of positive normals; maximum error: 0.85089 ulp */
float logf_pos_normal (float a)
{
    const float ln2 = 0.693147182f;           // 0x1.62e430p-1 // log(2)
    const float two_to_m23 = 1.19209290e-7f;  // 0x1.0p-23
    float m, r, s, t, i, f;
    int32_t e;

    /* log(a) = log(m * 2**i) = i * log(2) + log(m) */
    e = (float_as_uint32 (a) - float_as_uint32 (0.666666667f)) & 0xff800000;
    m = uint32_as_float (float_as_uint32 (a) - e);
    i = (float)e * two_to_m23;

    /* log(m) = log1p(f) */
    f = m - 1.0f;
    s = f * f;

    /* compute log1p(f) for f in [-1/3, 1/3] */
    r = -0.130310059f;               // -0x1.0ae000p-3
    t =  0.140869141f;               //  0x1.208000p-3
    r = fmaf (r, s, -0.121483363f);  // -0x1.f1988ap-4
    t = fmaf (t, s, 0.139814854f);   //  0x1.1e5740p-3
    r = fmaf (r, s, -0.166846141f);  // -0x1.55b36ep-3
    t = fmaf (t, s, 0.200120345f);   //  0x1.99d8b2p-3
    r = fmaf (r, s, -0.249996200f);  // -0x1.fffe02p-3
    r = fmaf (t, f, r);
    r = fmaf (r, f, 0.333331972f);   //  0x1.5554fap-2
    r = fmaf (r, f, -0.500000000f);  // -0x1.000000p-1
    r = fmaf (r, s, f);

    /* log(a) = i * log(2) + log(m) */
    r = fmaf (i, ln2, r);
    return r;
}

The double-precision implementation is structurally analogous to the single-precision implementation above, except that it uses the Iacono-Boyd iteration scheme:
#include <string.h>
#include <math.h>

double exp_scale_pos_normal (double a, int scale);
double log_pos_normal (double a);

/* Compute the principal branch of the Lambert W function, W_0. Maximum error:
   positive half-plane: 1.49210 ulp
   negative half-plane: 2.67824 ulp
*/
double lambert_w0 (double z) 
{
    const double em1_fact_0 = 0.57086272525975246; // 0x1.24481e7efdfcep-1 // exp(-1)_factor_0
    const double em1_fact_1 = 0.64442715366299452; // 0x1.49f25b1b461b7p-1 // exp(-1)_factor_1
    const double qe1 = 2.7182818284590452 * 0.25;  // 0x1.5bf0a8b145769p-1 // exp(1)/4
    double e, r, t, w, y, num, den, rden, redz;
    int i;

    if (isnan (z) || (z == INFINITY) || (z == 0.0)) return z + z;
    if (fabs (z) < 1.9073486328125e-6) return fma (fma (1.5, z, -1.) * z, z, z); // 0x1.0p-19
    redz = fma (em1_fact_0, em1_fact_1, z); // z + exp(-1)
    if (redz < 0.01025390625) { // expansion at -(exp(-1))
        r = sqrt (redz);
        w = -7.8466654751155138;             // -0x1.f62fc463917ffp+2
        w = fma (w, r, 10.0241581340373877); //  0x1.40c5e74773ef5p+3
        w = fma (w, r, -8.1029379749359691); // -0x1.034b44947bba0p+3
        w = fma (w, r, 5.8322883145113726);  //  0x1.75443634ead5fp+2
        w = fma (w, r, -4.1738796362609882); // -0x1.0b20d80dcb9acp+2
        w = fma (w, r, 3.0668053943936471);  //  0x1.888d14440efd0p+1
        w = fma (w, r, -2.3535499689514934); // -0x1.2d41201913016p+1
        w = fma (w, r, 1.9366310979331112);  //  0x1.efc70e3e0a0eap+0
        w = fma (w, r, -1.8121878855270763); // -0x1.cfeb8b968bd2cp+0
        w = fma (w, r, 2.3316439815968506);  //  0x1.2a734f5b6fd56p+1
        w = fma (w, r, -1.0000000000000000); // -0x1.0000000000000p+0
        return w;
    }
    /* Roberto Iacono and John Philip Boyd, "New approximations to the
       principal real-valued branch of the Lambert W function", Advances
       in Computational Mathematics, Vol. 43, No. 6, December 2017,
       pp. 1403-1436
    */
    y = fma (2.0, sqrt (fma (qe1, z, 0.25)), 1.0);
    y = log_pos_normal (fma (1.14956131, y, -0.14956131) / 
                        fma (0.4549574, log_pos_normal (y), 1.0));
    w = fma (2.036, y, -1.0);

    /* Use iteration scheme w = (w / (1 + w)) * (1 + log (z / w)) from
       Roberto Iacono and John Philip Boyd, "New approximations to the
       principal real-valued branch of the Lambert W function", Advances
       in Computational Mathematics, Vol. 43, No. 6, December 2017,
       pp. 1403-1436
    */
    for (i = 0; i < 3; i++) {
        t = w / (1.0 + w);
        w = fma (log_pos_normal (z / w), t, t);
    }

    /* Fine tune approximation with a single Newton iteration */
    e = exp_scale_pos_normal (w, -3); // 0.125 * exp (w)
    num = fma (w, e, -0.125 * z);
    den = fma (w, e, e);
    rden = 1.0 / den;
    w = fma (-num, rden, w);

    return w;
}

int double2hiint (double a)
{
    unsigned long long int t;
    memcpy (&t, &a, sizeof t);
    return (int)(t >> 32);
}

int double2loint (double a)
{
    unsigned long long int t;
    memcpy (&t, &a, sizeof t);
    return (int)(unsigned int)t;
}

double hiloint2double (int hi, int lo) 
{
    double r;
    unsigned long long int t;
    t = ((unsigned long long int)(unsigned int)hi << 32) | (unsigned int)lo;
    memcpy (&r, &t, sizeof r);
    return r;
}

/* exp(a) * 2**scale; pos. normal results only! Max. err. found: 0.89028 ulp */
double exp_scale_pos_normal (double a, int scale)
{
    const double ln2_hi = 6.9314718055829871e-01; // 0x1.62e42fefa00000p-01
    const double ln2_lo = 1.6465949582897082e-12; // 0x1.cf79abc9e3b3a0p-40
    const double l2e = 1.4426950408889634;        // 0x1.71547652b82fe0p+00 // log2(e)
    const double cvt = 6755399441055744.0;        // 0x1.80000000000000p+52 // 3*2**51
    double f, r;
    int i;

    /* exp(a) = 2**i * exp(f); i = rint (a / log(2)) */
    r = fma (l2e, a, cvt);
    i = double2loint (r);
    r = r - cvt;
    f = fma (r, -ln2_hi, a);
    f = fma (r, -ln2_lo, f);

    /* approximate r = exp(f) on interval [-log(2)/2,+log(2)/2] */
    r = 2.5022018235176802e-8;             // 0x1.ade0000000000p-26
    r = fma (r, f, 2.7630903497145818e-7); // 0x1.28af3fcbbf09bp-22
    r = fma (r, f, 2.7557514543490574e-6); // 0x1.71dee623774fap-19
    r = fma (r, f, 2.4801491039409158e-5); // 0x1.a01997c8b50d7p-16
    r = fma (r, f, 1.9841269589068419e-4); // 0x1.a01a01475db8cp-13
    r = fma (r, f, 1.3888888945916566e-3); // 0x1.6c16c1852b805p-10
    r = fma (r, f, 8.3333333334557735e-3); // 0x1.11111111224c7p-7
    r = fma (r, f, 4.1666666666519782e-2); // 0x1.55555555502a5p-5
    r = fma (r, f, 1.6666666666666477e-1); // 0x1.5555555555511p-3
    r = fma (r, f, 5.0000000000000122e-1); // 0x1.000000000000bp-1
    r = fma (r, f, 1.0000000000000000e+0); // 0x1.0000000000000p+0
    r = fma (r, f, 1.0000000000000000e+0); // 0x1.0000000000000p+0

    /* exp(a) = 2**(i+scale) * r */
    r = hiloint2double (double2hiint (r) + ((i + scale) << 20), 
                        double2loint (r));
    return r;
}

/* compute natural logarithm of positive normals; max. err. found: 0.86902 ulp */
double log_pos_normal (double a)
{
    const double ln2_hi = 6.9314718055994529e-01; // 0x1.62e42fefa39efp-01
    const double ln2_lo = 2.3190468138462996e-17; // 0x1.abc9e3b39803fp-56
    double m, r, i, s, t, p, q;
    int e;

    /* log(a) = log(m * 2**i) = i * log(2) + log(m) */
    e = (double2hiint (a) - double2hiint (0.70703125)) & 0xfff00000;
    m = hiloint2double (double2hiint (a) - e, double2loint (a));
    t = hiloint2double (0x41f00000, 0x80000000 ^ e);
    i = t - (hiloint2double (0x41f00000, 0x80000000));

    /* m now in [181/256, 362/256]. Compute q = (m-1) / (m+1) */
    p = m + 1.0;
    r = 1.0 / p;
    q = fma (m, r, -r);
    m = m - 1.0;

    /* compute (2*atanh(q)/q-2*q) as p(q**2), q in [-75/437, 53/309] */
    s = q * q;
    r = 1.4794533702196025e-1;             // 0x1.2efdf700d7135p-3
    r = fma (r, s, 1.5314187748152339e-1); // 0x1.39a272db730f7p-3
    r = fma (r, s, 1.8183559141306990e-1); // 0x1.746637f2f191bp-3
    r = fma (r, s, 2.2222198669309609e-1); // 0x1.c71c522a64577p-3
    r = fma (r, s, 2.8571428741489319e-1); // 0x1.24924941c9a2fp-2
    r = fma (r, s, 3.9999999999418523e-1); // 0x1.999999998006cp-2
    r = fma (r, s, 6.6666666666667340e-1); // 0x1.5555555555592p-1
    r = r * s;

    /* log(a) = 2*atanh(q) + i*log(2) = ln2_lo*i + p(q**2)*q + 2q + ln2_hi * i.
       Use K.C. Ng's trick to improve the accuracy of the computation, like so:
       p(q**2)*q + 2q = p(q**2)*q + q*t - t + m, where t = m**2/2.
    */
    t = m * m * 0.5;
    r = fma (q, t, fma (q, r, ln2_lo * i)) - t + m;
    r = fma (ln2_hi, i, r);

    return r;
}
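A small, hypothetical test driver (not part of the original answer) can be used to spot-check both implementations against points of the form x = k·exp(k), for which W_0(x) = k holds exactly:

#include <stdio.h>
#include <math.h>

float  lambert_w0f (float z);   /* single-precision implementation above */
double lambert_w0  (double z);  /* double-precision implementation above */

int main (void)
{
    /* k = -1 corresponds to the branch point x = -1/e */
    const double k[] = { -1.0, -0.5, -0.125, 0.0, 0.5, 1.0, 2.0, 10.0, 80.0 };
    for (int i = 0; i < (int)(sizeof k / sizeof k[0]); i++) {
        double x = k[i] * exp (k[i]);   /* W_0(x) == k[i] by construction */
        printf ("x=% .17e  w0=% .17e  w0f=% .8e  expected=% g\n",
                x, lambert_w0 (x), (double) lambert_w0f ((float) x), k[i]);
    }
    return 0;
}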