Related questions (0)

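What is the fastest/most efficient way to find the highest set bit (msb) in an integer in C?

If I have some integer n, and I want to know the position of the most significant bit (that is, if the least significant bit is on the right, I want to know the position of the leftmost bit that is a 1), what is the quickest/most efficient way of finding out?

I know that POSIX supports an ffs() function in strings.h to find the first set bit, but there doesn't seem to be a corresponding fls() function.

Is there some really obvious way of doing this that I'm missing?

What about cases where you can't use POSIX functions, for portability?

Edit: What about a solution that works on both 32-bit and 64-bit architectures? (Many of the code listings seem like they would only work on 32-bit integers.)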
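For reference, a minimal portable sketch of what the question describes, not necessarily the fastest approach. The function name `msb_index` and the convention of returning -1 for zero are assumptions made for illustration. It takes a fixed-width `uint64_t`, so it behaves the same on 32-bit and 64-bit targets and needs nothing from POSIX.

```c
#include <stdint.h>

/* Hypothetical helper: returns the 0-based index of the highest set bit
   of n, or -1 when n == 0 (no bit is set). Plain shift loop, so it takes
   at most 64 iterations and relies only on standard C. */
int msb_index(uint64_t n)
{
    int pos = -1;
    while (n != 0) {
        n >>= 1;   /* drop the lowest bit */
        pos++;     /* count how many shifts were needed */
    }
    return pos;
}
```

A caller passing a 32-bit value gets the same answer, because the argument is zero-extended to 64 bits before the loop runs.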

c algorithm optimization bit-manipulation

Zxa*_*aos

2017 03-04

112
Votes

11
Answers

110k
Views

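Find the highest order bit in C

What I'm after is something I can feed a number into that will return the highest order bit. I'm sure there's a simple way. Below is some example output (the input is on the left):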

1 -> 1
2 -> 2
3 -> 2
4 -> 4
5 -> 4
6 -> 4
7 -> 4
8 -> 8
9 -> 8
...
63 -> 32
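One possible way to produce the mapping above is the "bit smearing" trick, sketched here under the assumption of a 32-bit unsigned input. The name `highest_bit` is hypothetical, and returning 0 for an input of 0 is a choice made for this sketch, since the table does not cover that case.

```c
#include <stdint.h>

/* Hypothetical helper: returns the value of the highest set bit of n
   (for example 63 -> 32), or 0 when n == 0. */
uint32_t highest_bit(uint32_t n)
{
    /* Copy the highest set bit into every lower position, so that n
       becomes a run of ones from that bit down to bit 0. */
    n |= n >> 1;
    n |= n >> 2;
    n |= n >> 4;
    n |= n >> 8;
    n |= n >> 16;
    /* Subtracting the same run shifted right by one leaves only the top bit. */
    return n - (n >> 1);
}
```

The number of steps is fixed regardless of the input, which is why this form is often preferred over a loop when no hardware bit-scan instruction is available.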

Har*_*mbe

42
Votes

8
Answers

50k
Views

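Why does breaking the "output dependency" of LZCNT matter?

While benchmarking something, I measured a much lower throughput than I had calculated. I narrowed it down to the LZCNT instruction (it also happens with TZCNT), as the following benchmarks show: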

  xor ecx, ecx
_benchloop:
  lzcnt eax, edx
  add ecx, 1
  jnz _benchloop

And:

  xor ecx, ecx
_benchloop:
  xor eax, eax  ; this shouldn't help, but it does
  lzcnt eax, edx
  add ecx, 1
  jnz _benchloop

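The second version is much faster. It shouldn't be. There is no reason why LZCNT should have an input dependency on its output. Unlike BSR/BSF, the xZCNT instructions always overwrite their output.

I'm running this on a 4770K, so LZCNT and TZCNT are not being executed as BSR/BSF.

What's going on here?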

x86 assembly

har*_*old

22
Votes

1
Answers

1339
Views

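How to efficiently compute the highest power of 2 that is less than or equal to a given number?

I've come up with three solutions so far:

The extremely inefficient standard library pow and log2 functions: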

uint_fast16_t powlog(uint_fast16_t n)
{
  return static_cast<uint_fast16_t>(pow(2, floor(log2(n))));
}

Far more efficient: computing successive powers of 2 until the next one would exceed the given number:

uint_fast16_t multiply(uint_fast16_t n)
{
  uint_fast16_t maxpow = 1;
  while(2*maxpow <= n)
    maxpow *= 2;
  return maxpow;
}
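One caveat with the loop above: on platforms where `uint_fast16_t` arithmetic is carried out at the type's own width, `2*maxpow` can wrap around to 0 once `n` reaches the top half of the range, and the loop then never terminates. A possible overflow-safe variant, written in C, compares against `n / 2` instead. The name `multiply_safe` and the choice to return 0 for `n == 0` are assumptions of this sketch, not part of the original question.

```c
#include <stdint.h>

/* Hypothetical variant of multiply(): largest power of 2 that is <= n.
   Returns 0 for n == 0 (assumption: no power of 2 fits in that case). */
uint_fast16_t multiply_safe(uint_fast16_t n)
{
    if (n == 0)
        return 0;
    uint_fast16_t maxpow = 1;
    /* maxpow <= n / 2 is equivalent to 2 * maxpow <= n for unsigned
       integers, but the left-hand side can never overflow. */
    while (maxpow <= n / 2)
        maxpow *= 2;
    return maxpow;
}
```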

The most efficient so far: a binary search over a precomputed table of powers of 2:

uint_fast16_t binsearch(uint_fast16_t n)
{
  static array<uint_fast16_t, 20> pows {1,2,4,8,16,32,64,128,256,512,
    1024,2048,4096,8192,16384,32768,65536,131072,262144,524288};

  return *(upper_bound(pows.begin(), pows.end(), n)-1);
}

Can this be optimized further? Are there any tricks that could be used here?

The full benchmark I used:
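One trick that might apply here, sketched in C under the assumption that the compiler is GCC or Clang: the `__builtin_clzll` builtin counts leading zero bits and typically compiles down to a single bit-scan instruction where the hardware has one. It is not standard C or C++, and the function name `floor_pow2` is hypothetical. No timings are claimed for it.

```c
#include <stdint.h>

/* Hypothetical helper: largest power of 2 that is <= n, or 0 when n == 0.
   Assumes GCC or Clang, where __builtin_clzll is available. */
uint64_t floor_pow2(uint64_t n)
{
    if (n == 0)
        return 0;  /* __builtin_clzll(0) is undefined, so 0 is handled separately */
    /* 63 - (leading zero count) is the index of the highest set bit. */
    return (uint64_t)1 << (63 - __builtin_clzll(n));
}
```

Unlike the table lookup, this is not limited to the 20 precomputed entries, so it covers the full 64-bit range.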

#include <iostream>
#include <chrono>
#include <cmath>
#include <cstdint>
#include <array>
#include <algorithm>
using namespace std;
using namespace chrono;

uint_fast16_t powlog(uint_fast16_t n)
{
  return static_cast<uint_fast16_t>(pow(2, floor(log2(n))));
}

uint_fast16_t multiply(uint_fast16_t n)
{ …

c++ optimization

gaa*_*kam

12
Votes

1
Answers

1417
Views