har*_*old 16
此操作称为压缩权限.它是作为的一部分BMI2为PEXT指导,以英特尔处理器为Haswell的的.
不幸的是,没有硬件支持是一个非常烦人的操作.当然有一个明显的解决方案,只需在循环中逐个移动位,这是Hackers Delight给出的:
unsigned compress(unsigned x, unsigned m) {
unsigned r, s, b; // Result, shift, mask bit.
r = 0;
s = 0;
do {
b = m & 1;
r = r | ((x & b) << s);
s = s + b;
x = x >> 1;
m = m >> 1;
} while (m != 0);
return r;
}
Run Code Online (Sandbox Code Playgroud)
但还有另一种方法,也是由Hackers Delight给出的,它的循环次数较少(位数的迭代对数),但每次迭代次数更多:
unsigned compress(unsigned x, unsigned m) {
unsigned mk, mp, mv, t;
int i;
x = x & m; // Clear irrelevant bits.
mk = ~m << 1; // We will count 0's to right.
for (i = 0; i < 5; i++) {
mp = mk ^ (mk << 1); // Parallel prefix.
mp = mp ^ (mp << 2);
mp = mp ^ (mp << 4);
mp = mp ^ (mp << 8);
mp = mp ^ (mp << 16);
mv = mp & m; // Bits to move.
m = m ^ mv | (mv >> (1 << i)); // Compress m.
t = x & mv;
x = x ^ t | (t >> (1 << i)); // Compress x.
mk = mk & ~mp;
}
return x;
}
Run Code Online (Sandbox Code Playgroud)
请注意,那里的许多值仅依赖于m.由于您只有512个不同的掩码,您可以预先计算它们并将代码简化为这样的(未经测试)
unsigned compress(unsigned x, int maskindex) {
unsigned t;
int i;
x = x & masks[maskindex][0];
for (i = 0; i < 5; i++) {
t = x & masks[maskindex][i + 1];
x = x ^ t | (t >> (1 << i));
}
return x;
}
Run Code Online (Sandbox Code Playgroud)
当然,所有这些都可以通过展开变成"非循环",第二种和第三种方式可能更适合于此.然而,这有点作弊.
| 归档时间: |
|
| 查看次数: |
1730 次 |
| 最近记录: |