基于汉明重量的索引

1--*_*--1 11 c sorting indexing cuda hammingweight

假设我们有一个整数bitsize n=4;
我正在描述的问题是如何根据汉明权重及其知道的值来将数字索引到数组位置bitsize.例如,具有16个用于bitsize 4的元素的数组将/可能如下所示:

|0|1|2|4|8|3|5|6|9|10|12|7|11|13|14|15|
Run Code Online (Sandbox Code Playgroud)

元素按其汉明重量(必要)分组,并根据大小排序(不必要).只要您可以采取例如3(0011)进行一些操作并返回索引5,5(0101) - > 6等,则不需要排序.

n将存在所有位组合,并且不会重复.例如bitsize 3将有数组:

|0|1|2|4|3|5|6|7|
Run Code Online (Sandbox Code Playgroud)

我最好有一个没有循环的解决方案.或任何讨论simillar解决方案的论文.或者最后只是抛出任何关于如何做到这一点的想法.

小智 13

请注意,您可以使用以下函数枚举具有相同汉明重量的数字(按计数顺序):

int next(int n) { // get the next one with same # of bits set
  int lo = n & -n;       // lowest one bit
  int lz = (n + lo) & ~n;      // lowest zero bit above lo
  n |= lz;                     // add lz to the set
  n &= ~(lz - 1);              // reset bits below lz
  n |= (lz / lo / 2) - 1;      // put back right number of bits at end
  return n;
}

int prev(int n) { // get the prev one with same # of bits set
   int y = ~n;
   y &= -y; // lowest zero bit
   n &= ~(y-1); // reset all bits below y
   int z = n & -n; // lowest set bit
   n &= ~z;        // clear z bit
   n |= (z - z / (2*y)); // add requried number of bits below z
   return n;
 }
Run Code Online (Sandbox Code Playgroud)

例如,在x = 5678上重复应用prev():

0: 00000001011000101110 (5678)
1: 00000001011000101101 (5677)
2: 00000001011000101011 (5675)
3: 00000001011000100111 (5671)
4: 00000001011000011110 (5662)
5: 00000001011000011101 (5661)
6: 00000001011000011011 (5659)
.....
Run Code Online (Sandbox Code Playgroud)

因此,从理论上讲,您可以通过重复应用来计算数字的索引.然而,这可能需要很长时间.更好的方法是"跳过"某些组合.

有两个规则:

 1. if the number starts with: ..XXX10..01..1 we can replace it by ..XXX0..01..1
adding corresponding number of combinations
 2. if the number starts with: ..XXX1..10..0 again replace it by XXX0..01..1 with corresponding number of combinations 
Run Code Online (Sandbox Code Playgroud)

以下算法计算具有相同汉明权重的数字中的数字的索引(我不打扰快速实现二项式):

#define LOG2(x) (__builtin_ffs(x)-1)

int C(int n, int k) { // simple implementation of binomial
 int c = n - k; 
 if(k < c) 
   std::swap(k,c);
 if(c == 0)
  return 1;
 if(k == n-1) 
  return n;
 int b = k+1;
 for(int i = k+2; i <= n; i++) 
    b = b*i;
 for(int i = 2; i <= c; i++)
   b = b / i;
 return b;
}
int position_jumping(unsigned x) {
   int index = 0;
  while(1) {

    if(x & 1) { // rule 1: x is of the form: ..XXX10..01..1
        unsigned y = ~x;
        unsigned lo = y & -y; // lowest zero bit
        unsigned xz = x & ~(lo-1); // reset all bits below lo
        unsigned lz = xz & -xz; // lowest one bit after lo
        if(lz == 0) // we are in the first position!
           return index;

        int nn = LOG2(lz), kk = LOG2(lo)+1;       
        index += C(nn, kk); //   C(n-1,k) where n = log lz and k = log lo + 1

        x &= ~lz; //! clear lz bit
        x |= lo; //! add lo

    } else { // rule 2: x is of the form: ..XXX1..10..0
        int lo = x & -x; // lowest set bit
        int lz = (x + lo) & ~x;  // lowest zero bit above lo  
        x &= ~(lz-1); // clear all bits below lz
        int sh = lz / lo;

        if(lz == 0) // special case meaning that lo is in the last position
            sh=((1<<31) / lo)*2;
        x |= sh-1;

        int nn = LOG2(lz), kk = LOG2(sh);
        if(nn == 0)
           nn = 32;
        index += C(nn, kk);
    }
    std::cout << "x: " << std::bitset<20>(x).to_string() << "; pos: " << index << "\n";
  }
 }
Run Code Online (Sandbox Code Playgroud)

例如,给定数字x = 5678,算法将仅在4次迭代中计算其索引:

  x: 00000001011000100111; pos: 4
  x: 00000001011000001111; pos: 9
  x: 00000001010000011111; pos: 135
  x: 00000001000000111111; pos: 345
  x: 00000000000001111111; pos: 1137
Run Code Online (Sandbox Code Playgroud)

注意,1137是具有相同汉明权重的数字组中的5678的位置.因此,您必须相应地移动此索引以考虑具有较小汉明权重的所有数字