如何通过按位运算尽可能精确地计算C中整数的log2

Question

如何通过按位运算尽可能精确地计算C中整数的log2

asc*_*cub 5 c bit-manipulation entropy bitwise-operators

我需要计算熵，并且由于我的系统的限制，我需要使用受限的 C 功能（无循环，无浮点支持），并且我需要尽可能高的精度。从这里我弄清楚如何使用按位运算来估计整数的下限 log2。尽管如此，我需要提高结果的精度。由于不允许浮点运算，有没有什么方法可以进行计算，log2(x/y)使x < y结果类似于log2(x/y)*10000，旨在通过算术整数获得我需要的精度？

Answer 1

thb*_*thb 4

您将基于以下公式制定算法

log2(x/y) = K*(-log(x/y));

Run Code Online (Sandbox Code Playgroud)

在哪里

 K        = -1.0/log(2.0); // you can precompute this constant before run-time
 a        = (y-x)/y;
-log(x/y) = a + a^2/2 + a^3/3 + a^4/4 + a^5/5 + ...

Run Code Online (Sandbox Code Playgroud)

如果你正确地编写了循环，或者，如果你愿意，展开循环以无循环地编码相同的操作序列，那么你可以处理整数运算中的所有内容：

(y^N*(1*2*3*4*5*...*N)) * (-log(x/y))
  = y^(N-1)*(2*3*4*5*...*N)*(y-x) + y^(N-2)*(1*3*4*5*...*N)*(y-x)^2 + ...

Run Code Online (Sandbox Code Playgroud)

当然^，绑定比更紧密的幂运算符*不是 C 运算符，但您可以在（可能展开的）循环上下文中有效地实现它作为运行产品。

这N是一个足够大的整数，足以提供所需的精度，但又不会大到超出可用的位数。如果不确定，请尝试N = 6举例。关于K，您可能会反对它是一个浮点数，但这对您来说不是问题，因为您将预先计算K，将其存储为整数的比率。

示例代码

这是一个玩具代码，但它适用于较小的值，x例如y5 和 7，因此足以证明这个概念。在玩具代码中，较大的值可能会悄悄溢出默认的 64 位寄存器。需要做更多的工作来使代码变得健壮。

#include <stddef.h>
#include <stdlib.h>
// Your program will not need the below headers, which are here
// included only for comparison and demonstration.
#include <math.h>
#include <stdio.h>

const size_t     N = 6;
const long long Ky = 1 << 10; // denominator of K
// Your code should define a precomputed value for Kx here.

int main(const int argc, const char *const *const argv)
{
    // Your program won't include the following library calls but this
    // does not matter.  You can instead precompute the value of Kx and
    // hard-code its value above with Ky.
    const long long Kx = lrintl((-1.0/log(2.0))*Ky); // numerator of K
    printf("K == %lld/%lld\n", Kx, Ky);

    if (argc != 3) exit(1);

    // Read x and y from the command line.
    const long long x0 = atoll(argv[1]);
    const long long y  = atoll(argv[2]);
    printf("x/y == %lld/%lld\n", x0, y);
    if (x0 <= 0 || y <= 0 || x0 > y) exit(1);

    // If 2*x <= y, then, to improve accuracy, double x repeatedly
    // until 2*x > y. Each doubling offsets the log2 by 1. The offset
    // is to be recovered later.
    long long               x = x0;
    int integral_part_of_log2 = 0;
    while (1) {
        const long long trial_x = x << 1;
        if (trial_x > y) break;
        x = trial_x;
        --integral_part_of_log2;
    }
    printf("integral_part_of_log2 == %d\n", integral_part_of_log2);

    // Calculate the denominator of -log(x/y).
    long long yy = 1;
    for (size_t j = N; j; --j) yy *= j*y;

    // Calculate the numerator of -log(x/y).
    long long xx = 0;
    {
        const long long y_minus_x = y - x;
        for (size_t i = N; i; --i) {
            long long term = 1;
            size_t j       = N;
            for (; j > i; --j) {
                term *= j*y;
            }
            term *= y_minus_x;
            --j;
            for (; j; --j) {
                term *= j*y_minus_x;
            }
            xx += term;
        }
    }

    // Convert log to log2.
    xx *= Kx;
    yy *= Ky;

    // Restore the aforementioned offset.
    for (; integral_part_of_log2; ++integral_part_of_log2) xx -= yy;

    printf("log2(%lld/%lld) == %lld/%lld\n", x0, y, xx, yy);
    printf("in floating point, this ratio of integers works out to %g\n",
      (1.0*xx)/(1.0*yy));
    printf("the CPU's floating-point unit computes the log2 to be  %g\n",
      log2((1.0*x0)/(1.0*y)));

    return 0;
}

Run Code Online (Sandbox Code Playgroud)

使用命令行参数在我的机器上运行它5 7，它输出：

K == -1477/1024
x/y == 5/7
integral_part_of_log2 == 0
log2(5/7) == -42093223872/86740254720
in floating point, this ratio of integers works out to -0.485279
the CPU's floating-point unit computes the log2 to be  -0.485427

Run Code Online (Sandbox Code Playgroud)

N = 12和会大大提高准确性Ky = 1 << 20，但为此您需要更简洁的代码或超过 64 位。

节俭代码

需要更多精力编写的 Thriftier 代码可能代表质因数中的分子和分母。例如，它可能将 500 表示为 [2 0 3]，即 (2 ² )(3 ⁰ )(5 ³ )。

然而，你的想象力可能会进一步提高。

另一种方法

对于另一种方法，虽然它可能无法完全满足您的要求，但如果您的程序是我的，@phuclv 给出了我倾向于遵循的建议：逆向解决问题，猜测c/d对数的值并然后计算2^(c/d)，大概是通过牛顿-拉夫森迭代。就我个人而言，我更喜欢牛顿-拉夫森方法。见教派。4.8在这里（我原来的）。

数学背景

包括我在内的几个来源已经链接解释了第一种方法的泰勒级数和第二种方法的牛顿-拉夫森迭代。不幸的是，数学并不简单，但你已经明白了。祝你好运。

归档时间：	6 年，11 月前
查看次数：	1255 次
最近记录：	6 年，11 月前