What is the fastest way to perform hardware integer division by a fixed constant?

Question

gef*_*eft 9 algorithm division system-verilog

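I have a 16-bit number that I want to divide by 100. Say it is 50000; the goal is to get 500. However, I am trying to avoid inferring a divider on my FPGA, because dividers break my timing requirements. The result does not have to be exact; an approximation is fine.

I tried hardware multiplication by 0.01, but real numbers are not supported. I am now looking at pipelined dividers, but I hope it does not come to that.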

Answer 1

Dou*_*rie 12

Conceptually: multiply by 655 (≈ 65536/100, which is 655.36) and then shift right by 16 bits. Of course, in hardware, the right shift is free.
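As a rough sketch of the multiply-and-shift idea in C (the function name `div100_mul655` is made up for illustration): the product of a 16-bit input and 655 needs up to 26 bits, so a 32-bit intermediate is assumed here. Because 655 is slightly below 655.36, the result is never too high but can come out one low; 50000, for instance, gives 499 rather than 500.

```c
#include <stdint.h>

/* Hypothetical helper: approximate n / 100 as (n * 655) >> 16. */
uint16_t div100_mul655(uint16_t n)
{
    uint32_t product = (uint32_t)n * 655u;  /* at most 26 bits for a 16-bit n */
    return (uint16_t)(product >> 16);       /* in hardware: just select bits 25:16 */
}
```

Whether an occasional off-by-one-low result is acceptable depends on the application; a wider constant and shift reduce the error at the cost of a wider multiplier.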

If you need it even faster, you can hardwire the division as a sum of divisions by powers of two (shifts). For example:

1/100 ~= 1/128                  = 0.0078125
1/100 ~= 1/128 + 1/256          = 0.01171875
1/100 ~= 1/128 + 1/512          = 0.009765625
1/100 ~= 1/128 + 1/512 + 1/2048 = 0.01025390625
1/100 ~= 1/128 + 1/512 + 1/4096 = 0.010009765625
etc.


In C code, the last example above would be:

uint16_t divideBy100 (uint16_t input)
{
    return (input >> 7) + (input >> 9) + (input >> 12);
}


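Since 65536/100 is not an exact value, you need to do a bit of analysis to make sure the result is within your error bounds. More bits might help, or you might be able to get away with fewer. (2 upvotes)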
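@Morgan Actually, shifting first and then adding is quite inaccurate: over the input range 0-65535, 36016 results are off by 1 (in both directions) and 5580 are off by 2! If you change the expression to `((n << 5) + (n << 3) + n) >> 12`, 20640 results are off by 1, always on the high side. It is equivalent to `n*41/4096`. (2 upvotes)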
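A sketch of the add-then-shift variant from the comment above, assuming a 32-bit intermediate (41 × 65535 needs 22 bits, so the sum does not fit in 16). The names `div100_shift_add` and `worst_error` are illustrative only, and the scan is just one way to do the error analysis the comments call for; it reports the largest deviation from truncating division and does not try to reproduce the counts quoted above.

```c
#include <stdint.h>

/* Hypothetical helper: n * 41 / 4096, built from shifts and adds only. */
uint16_t div100_shift_add(uint16_t n)
{
    uint32_t wide = n;                                /* widen before shifting left */
    uint32_t sum = (wide << 5) + (wide << 3) + wide;  /* 32n + 8n + n = 41n */
    return (uint16_t)(sum >> 12);                     /* divide by 4096 */
}

/* Largest absolute difference from n / 100 over every 16-bit input. */
int worst_error(uint16_t (*approx)(uint16_t))
{
    int worst = 0;
    for (uint32_t n = 0; n <= 65535u; n++) {
        int err = (int)approx((uint16_t)n) - (int)(n / 100u);
        if (err < 0) err = -err;
        if (err > worst) worst = err;
    }
    return worst;
}
```

Since 41/4096 exceeds 1/100 by exactly 1/102400, the excess stays below 0.64 for 16-bit inputs, which is consistent with the comment's observation that this form only ever errs high, and by at most 1.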

Answer 2

use*_*109 5

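Assuming that

integer division is meant to truncate rather than round (e.g. 599 / 100 = 5), and
it is acceptable to have a 16x16 multiplier in the FPGA (with a fixed value on one input),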

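then you can get exact values by implementing a 16x16 unsigned multiplier where one input is 0xA3D7 and the other input is your 16-bit number. Add 0x8000 to the 32-bit product, and your result is in the upper 10 bits.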

In C code, the algorithm looks like this:

uint16_t divideBy100( uint16_t input )
{
    uint32_t temp;

    temp = input;
    temp *= 0xA3D7;     // compute the 32-bit product of two 16-bit unsigned numbers
    temp += 0x8000;     // adjust the 32-bit product since 0xA3D7 is actually a little low
    temp >>= 22;        // the upper 10-bits are the answer

    return( (uint16_t)temp );
}
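The constants can be sanity-checked: 0xA3D7 = 41943 is 2^22/100 = 41943.04 rounded down, and adding 0x8000 makes up for the shortfall before the shift by 22. A small exhaustive check, sketched here with made-up names (`div100_mul_a3d7`, `count_mismatches`), compares the result against C's truncating `/` for every 16-bit input:

```c
#include <stdint.h>

/* Same arithmetic as divideBy100 above, restated so this sketch compiles on its own. */
uint16_t div100_mul_a3d7(uint16_t n)
{
    uint32_t temp = (uint32_t)n * 0xA3D7u;  /* 0xA3D7 = 41943, just under 2^22 / 100 */
    temp += 0x8000u;                        /* compensate for the constant being low */
    return (uint16_t)(temp >> 22);          /* keep the upper 10 bits */
}

/* Number of 16-bit inputs where the result differs from n / 100. */
uint32_t count_mismatches(void)
{
    uint32_t bad = 0;
    for (uint32_t n = 0; n <= 65535u; n++) {
        if ((uint32_t)div100_mul_a3d7((uint16_t)n) != n / 100u) {
            bad++;
        }
    }
    return bad;
}
```

If the claim of exactness holds, `count_mismatches()` returns 0. The largest intermediate value, 65535 × 41943 + 32768, stays below 2^32, so the 32-bit `temp` does not overflow.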

