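#include <stdio.h>
#include <stdint.h>
#include <xmmintrin.h>

float a[4] = {1, 2, 3, 4}, b[4] = {4, 3, 2, 1};
uint32_t c[4];

int main() {
    __m128 pa = _mm_loadu_ps(a);
    __m128 pb = _mm_loadu_ps(b);
    __m128 pc = _mm_cmpgt_ps(pa, pb);   /* per-lane mask: all ones where a > b */
    _mm_storeu_ps((float*)c, pc);       /* the store in question */
    for (int i = 0; i < 4; ++i) printf("%u\n", c[i]);
    return 0;
}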
What is the correct instruction to use in place of _mm_storeu_ps((float*)c, pc)? Here, c is an integer array... I don't think this way is good. Is there a better one?
小智 7
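SSE2 has two instructions that convert a __m128 (float vector) to a __m128i (int32_t vector): _mm_cvtps_epi32 (with rounding) and _mm_cvttps_epi32 (with truncation). Both convert the numeric value of each lane; they do not reinterpret its bits.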
__m128i vi = _mm_cvttps_epi32(pc);
_mm_storeu_si128((__m128i *)c, vi);
If you can't use SSE2, store pc into a float array first, then convert that float array into the int array.
float d[4];
_mm_storeu_ps(d, pc);
c[0] = (int)d[0]; c[1] = (int)d[1]; c[2] = (int)d[2]; c[3] = (int)d[3];