我想编写一个 ac 程序来计算某个范围内的字节数a...c使用以下代码:
char a[16], b[16], c[16];
int counter = 0;
for(i = 0; i < 16; i++)
{
if((a[i] < b[i]) && (b[i] < c[i]))
counter++;
}
return counter;
Run Code Online (Sandbox Code Playgroud)
我打算做这样的事情
__m128i result1 = _mm_cmpgt_epi8 (b, a);
__m128i result2 = _mm_cmplt_epi8 (b, c);
unsigned short out1 = _mm_movemask_epi8(result1);
unsigned short out2 = _mm_movemask_epi8(result2);
unsigned short out3 = out1 & out2;
unsigned short out4 = _mm_popcnt_u32(out3);
Run Code Online (Sandbox Code Playgroud)
我的方法正确吗?有更好的方法吗?
你的做法看起来很合理。我认为您可以通过在 SIMD 寄存器内执行 AND 操作来保存指令,如下所示:
__m128i result1 = _mm_cmpgt_epi8 (b, a);
__m128i result2 = _mm_cmplt_epi8 (b, c);
__m128i mask = _mm_and_si128(result1, result2);
int mask2 = _mm_movemask_epi8(mask);
int counter = _mm_popcnt_u32(mask2);
Run Code Online (Sandbox Code Playgroud)
| 归档时间: |
|
| 查看次数: |
379 次 |
| 最近记录: |