ano*_*nol 6 c floating-point trigonometry
该POSIX页面中的tan函数族(tan,tanf,tanl)用C说:
如果正确的值会导致溢出,则会发生范围错误,并且 tan()、tanf() 和 tanl() 应分别返回 ±HUGE_VAL、±HUGE_VALF 和 ±HUGE_VALL,其符号与正确的值相同功能。
然而,在实践中是非常困难的实际获得infinity/-infinity在这种情况下,由于浮点数将需要足够接近π/ 2,使得其切线会比该类型的表示的最大浮点值较大.
根据经验,我无法使用标准 glibc 获得这样的结果,即使使用这些nextafter函数来获得最接近 ?/2 的可能值(atan(1) * 2用作“一半?”,并从那里开始,无论是左侧还是右侧)。
对所有(32 位)floats 的详尽测试证实这对于 32 位是正确的,至少在我的库中是这样。测试doubles 和long doubles的 ?/2 附近表明这也是他们的情况。然而,详尽的测试有点太长了:我需要测试 (2k+1)•?/2 的附近,对于 ?k ? ?.
那么,是否有一些数学或实践论证可以让人们得出结论,至少在“合理正确”的库实现中(例如,使用一些测量的 ULP 边界,就像GNU C 库数学函数所做的那样),浮点精度 -点值将始终tan(x)适合这些值的有限表示?换句话说,它tan向无穷大的增长速度不会比我们接近 ?/2 的速度快吗?
请注意,我不包括tan(NAN)与tan(INFINITY)从讨论,因为这些记录的角落案件返回NaN的。此外,使用次正规数可能会获得不同的结果,但由于它们只出现在零附近而不是 ?/2 附近,我相信我们可以排除它们。
因此,我正在寻找一些数学论证/证明/详尽的测试来表明这不会发生,或者只是一个反例,它使用任何标准的 C 库实现,这样包含<math.h>和调用tan就可以做到;但不包括具有非标准tan类函数的特定库。
这是一个实际演示,表明tan(x)对于任何有限的 IEEE 754 binary64 double ,正确舍入永远不会是无限的x;事实上,它永远不可能超过2.2e+18。同样的方法也适用于 64 位精度long double类型,但搜索时间会更长一些。它甚至对于 IEEE 754 二进制 128 和二进制 256 浮点格式应该是可行的。除此之外,启发式论证表明任何 IEEE 754 二进制浮点格式都不太可能溢出,但我不知道任何证据,并且我怀疑不存在已知的证据。tan(x)
显然,仅考虑正有限浮点数就足够了(此处和下文中,“float”表示 IEEE 754 二进制 64 浮点值)。m * 2^e每个这样的浮点数都有一些整数的形式m,并且e带有0 < m < 2^53和-1074 <= e <= 971.
So we want to find integers n, m and e minimising the absolute value |n \xcf\x80/2 - m 2^e|, subject to constraints 0 < n, 0 < m < 2^53 and -1074 <= e <= 971. We\'re actually only interested in odd values of n, but the optimal n will already be odd: if it\'s even, then we can halve n and decrement e to halve the absolute value. Moreover, it\'s clear that there\'s little point examining e smaller than -53, so we can further restrict to -53 <= e <= 971.
For any particular fixed value of e, our problem is equivalent to finding positive integers m and n that minimise the absolute value |n - m 2^(e+1)/\xcf\x80|, subject to the constraint m < 2^53. If we can solve this problem efficiently, we can iterate over the 1025 distinct values of e to pull out the overall best solution.
But the subproblem for fixed exponent e can be solved by finding the continued-fraction expansion of 2^(e+1) / \xcf\x80: standard number-theoretic results state that the minimum of |n - m 2^(e+1) / \xcf\x80| is guaranteed to come from either a convergent or a semi-convergent n / m to 2^(e+1) / \xcf\x80. All we have to do is find the best convergent or semi-convergent whose denominator is strictly smaller than 2^53. In practice, the continued fraction expansion allows us to find both the best convergent or semi-convergent (subject to the constraint the the denominator is less than 2^53) that\'s strictly greater than |2^(e+1) / \xcf\x80|, and separately the best convergent or semi-convergent (with similarly constrained denominator) that\'s strictly smaller than |2^(e+1) / \xcf\x80|, and then a comparison is needed to determine which of the two convergents is closer.
Here\'s some Python code that implements the above, and ends up finding the same best approximation already mentioned in the comments. First, a handful of imports that we\'ll need later on:
\n\nfrom fractions import Fraction\nfrom math import floor, ldexp\nRun Code Online (Sandbox Code Playgroud)\nWe\'ll need an approximation for \xcf\x80/2. In fact, we\'ll need a pair of approximations: a lower and an upper bound. We\'ll use the first 1000 digits of \xcf\x80, since they\'re well documented and easy to find online (though it turns out that in practice we actually only need around 330 digits). With those digits, we create Fraction instances HALF_PI_LOWER and HALF_PI_UPPER representing tight lower and upper bounds for \xcf\x80/2:
# Approximation to \xcf\x80, given by the first 1000 digits after the point. (source:\n# https://mathshistory.st-andrews.ac.uk/HistTopics/1000_places/)\nPI_APPROX = Fraction("".join("""\n3.14159265358979323846264338327950288419716939937510\n 58209749445923078164062862089986280348253421170679\n 82148086513282306647093844609550582231725359408128\n 48111745028410270193852110555964462294895493038196\n 44288109756659334461284756482337867831652712019091\n 45648566923460348610454326648213393607260249141273\n 72458700660631558817488152092096282925409171536436\n 78925903600113305305488204665213841469519415116094\n 33057270365759591953092186117381932611793105118548\n 07446237996274956735188575272489122793818301194912\n 98336733624406566430860213949463952247371907021798\n 60943702770539217176293176752384674818467669405132\n 00056812714526356082778577134275778960917363717872\n 14684409012249534301465495853710507922796892589235\n 42019956112129021960864034418159813629774771309960\n 51870721134999999837297804995105973173281609631859\n 50244594553469083026425223082533446850352619311881\n 71010003137838752886587533208381420617177669147303\n 59825349042875546873115956286388235378759375195778\n 18577805321712268066130019278766111959092164201989\n""".split()))\n\n# True value of \xcf\x80 is within ERROR of the value given above.\nERROR = Fraction("1e-1000")\n\nHALF_PI_LOWER = (PI_APPROX - ERROR) / 2\nHALF_PI_UPPER = (PI_APPROX + ERROR) / 2\nRun Code Online (Sandbox Code Playgroud)\n现在为续分数机械。首先,我们给出一个生成器函数,当提供某个值的下限和上限时,该函数生成该值的连分数系数序列。本质上,这会分别计算下限和上限的系数,当这些系数匹配时,它们会被传递给调用者。一旦它们不匹配,我们就没有足够的信息来确定下一个系数,并且该函数将在该点引发异常( 或RuntimeError)ZeroDivisionError。我们不太关心引发哪个异常:我们依赖于我们的原始近似值足够准确,以至于我们永远不会遇到该异常。
def continued_fraction_coefficients(lower: Fraction, upper: Fraction):\n """\n Generate continued fraction coefficients for a positive number.\n\n Given rational lower and upper bounds for a postive rational or irrational\n number, generate the known coefficients of the continued fraction\n representation of that number. The tighter the lower and upper bounds, the\n more coefficients will be known.\n\n Raises RuntimeError or ZeroDivisionError when there\'s insufficient\n information to determine the next coefficient.\n """\n while (q := floor(lower)) == floor(upper):\n yield q\n lower, upper = 1/(upper - q), 1/(lower - q)\n raise RuntimeError("Insufficient resolution")\nRun Code Online (Sandbox Code Playgroud)\n接下来,这是一个函数,它使用这一系列连分数系数来生成相应值的最佳下限和最佳上限近似值(对于分母的给定限制)。它从连分数序列生成连续收敛,并且在超过分母限制后,它回溯以找到适合分母限制的最佳界限。
\ndef best_bounds(coefficients, max_denominator):\n """\n Find best lower and upper bounds under a given denominator bound.\n\n *coefficients* is the sequence of continued fraction coefficients\n for the value we\'re approximating.\n\n *max_denominator* is the limit on the denominator.\n\n Returns the best lower approximation and the best upper approximation\n to the value being approximated, for the given limit on the denominator.\n """\n # a/b and c/d are the closest convergents so far\n # sign is 1 if a/b < value < c/d, and -1 otherwise.\n\n a, b, c, d, sign = 0, 1, 1, 0, 1\n for q in coefficients:\n a, b, c, d, sign = c, d, a + q * c, b + q * d, -sign\n if max_denominator < d:\n correction = (max_denominator - d) // b\n c, d = c + correction * a, d + correction * b\n break\n\n if sign == 1:\n return Fraction(a, b), Fraction(c, d)\n else:\n return Fraction(c, d), Fraction(a, b)\nRun Code Online (Sandbox Code Playgroud)\n现在我们编写一个专用函数,对于固定指数,最小e化|n \xcf\x80/2 - m 2^e|所有n和。如果返回两个三元组,其中是浮点值,表示为 Python ,并给出近似值中的误差,也转换为浮点数。(因此这些值并不完全精确,但它们足够接近,足以让我们找到最佳的整体近似值。)这两个三元组代表最佳的下限和上限。mm < 2^53(x, n, error)xm 2^efloaterrorerror
def best_for_given_exponent(e):\n """\n Find best lower and upper approximations for 2^e / (\xcf\x80/2)\n with denominator smaller than 2**53.\n\n Returns two triples of the form (x, n, error), where:\n\n - x is a float close to an integer multiple of \xcf\x80/2\n - n is the coefficient of that integer multiple of \xcf\x80/2\n - error is the absolute error |x - n \xcf\x80/2|, rounded to the nearest float.\n\n The first triple gives a best lower approximation for a multiple of \xcf\x80/2,\n while the second triple gives a best upper approximation.\n """\n twopow = Fraction(2)**e\n coefficients = continued_fraction_coefficients(twopow/HALF_PI_UPPER, twopow/HALF_PI_LOWER)\n best_lower, best_upper = best_bounds(coefficients, max_denominator=2**53 - 1)\n\n n, d = best_lower.as_integer_ratio()\n lower_result = ldexp(d, e), n, float(d * twopow - n * HALF_PI_LOWER)\n\n n, d = best_upper.as_integer_ratio()\n upper_result = ldexp(d, e), n, float(n * HALF_PI_UPPER - d * twopow)\n\n return lower_result, upper_result\nRun Code Online (Sandbox Code Playgroud)\n最后,我们将所有这些放在一起,以找到 的奇数倍的总体最佳二进制 64 近似值\xcf\x80/2。
all_results = [\n result for e in range(-53, 972)\n for result in best_for_given_exponent(e)\n]\nx, n, error = min(all_results, key=lambda result: result[2])\n\nprint(f"Best approximation: {x.hex()} is an approximation to {n}*\xcf\x80/2 with approximate error {error}")\nRun Code Online (Sandbox Code Playgroud)\n当我在我的机器上运行它时,它会给出以下输出(经过大约 5 秒的计算):
\nBest approximation: 0x1.6ac5b262ca1ffp+849 is an approximation to 3386417804515981120643892082331156599120239393299838035242121518428537554064774221620930267583474709602068045686026362989271814411863708499869721322715946622634302011697632972907922558892710830616034038541342154669787134871905353772776431251615694251273653*\xcf\x80/2 with approximate error 4.687165924254628e-19\nRun Code Online (Sandbox Code Playgroud)\n因此,总体最坏的情况是浮点数,它与 Eric Postpischil 在对原始问题的评论中引用的0x1.6ac5b262ca1ffp+849值相匹配:6381956970095103\xe2\x80\xa22^797
>>> float.fromhex(\'0x1.6ac5b262ca1ffp+849\') == ldexp(6381956970095103, 797)\nTrue\nRun Code Online (Sandbox Code Playgroud)\n该值的正切值大致为\xc2\xb11/4.687165924254628e-19, 或2.133485385753704e+18,这实际上是我机器上的数学库生成的值:
>>> from math import tan\n>>> tan(float.fromhex(\'0x1.6ac5b262ca1ffp+849\'))\n-2.133485385753704e+18\nRun Code Online (Sandbox Code Playgroud)\n因此,没有任何有限的二进制 64 浮点数的正切值超过2.133485385753704e+18。