jot*_*tik 7 c++ x86 x86-64 simd intrinsics
I need a way to compare values of type __m128i in C++ that establishes a total order over all values of that type. The kind of order doesn't matter, as long as it is a total order: it could be less-than on 128-bit integers or something else entirely.
I tried the < operator, but it doesn't return a bool; instead it seems to compare the vector components of __m128i lane by lane (i.e. SIMD):
#include <emmintrin.h>

inline bool isLessThan(__m128i a, __m128i b) noexcept {
    // error: cannot convert '__vector(2) long int' to 'bool' in return
    return a < b;
}
Another possibility would be to use memcmp/strcmp or similar, but this would very likely not be optimal. Targeting modern Intel x86-64 CPUs with at least SSE4.2 and AVX2, are there any intrinsics / instructions I could use for such comparisons? How to do it?
PS: A similar question has already been asked about checking for equality, rather than ordering.
Here you go.
#include <emmintrin.h>

inline bool isLessThan( __m128i a, __m128i b )
{
    /* Compare 8-bit lanes for ( a < b ), store the bits in the low 16 bits of the
    scalar value: */
    const int less = _mm_movemask_epi8( _mm_cmplt_epi8( a, b ) );
    /* Compare 8-bit lanes for ( a > b ), store the bits in the low 16 bits of the
    scalar value: */
    const int greater = _mm_movemask_epi8( _mm_cmpgt_epi8( a, b ) );
    /* It's counter-intuitive, but this scalar comparison does the right thing.
    Integer comparison is decided by the most significant bit that differs,
    which here corresponds to the highest byte lane where a and b differ. */
    return less > greater;
}
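The resulting order isn't ideal, because pcmpgtb treats the bytes as signed integers, but you said that doesn't matter for your use case.
Update: here's a slightly slower version that sorts in uint128_t order.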
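Since the point of a total order is usually sorting or keyed containers, here is a minimal sketch of plugging the comparison into std::sort. The names Key, KeyLess and sortKeys are made up for illustration; the wrapper struct is only there because some compilers warn about ignored attributes when __m128i is used directly as a template argument.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <algorithm>
#include <vector>

// Signed-byte comparison, same logic as the isLessThan listing.
inline bool isLessThan(__m128i a, __m128i b) noexcept {
    const int less = _mm_movemask_epi8(_mm_cmplt_epi8(a, b));
    const int greater = _mm_movemask_epi8(_mm_cmpgt_epi8(a, b));
    return less > greater;
}

// Hypothetical wrapper type holding one 128-bit value.
struct Key {
    __m128i value;
};

// Comparator object usable with std::sort, std::set, std::map, ...
struct KeyLess {
    bool operator()(const Key& a, const Key& b) const noexcept {
        return isLessThan(a.value, b.value);
    }
};

// Sorts the keys in the order defined by isLessThan.
inline void sortKeys(std::vector<Key>& keys) {
    std::sort(keys.begin(), keys.end(), KeyLess{});
}
```

The same KeyLess should also work as the comparator of std::set<Key, KeyLess> or std::map, since the comparison is irreflexive and decided by the highest differing byte.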
// True if a < b, for unsigned 128 bit integers
inline bool cmplt_u128( __m128i a, __m128i b )
{
    // Flip the sign bits in both arguments.
    // Transforms 0 into -128 = minimum for signed bytes,
    // 0xFF into +127 = maximum for signed bytes
    const __m128i signBits = _mm_set1_epi8( (char)0x80 );
    a = _mm_xor_si128( a, signBits );
    b = _mm_xor_si128( b, signBits );

    // Now the signed byte comparisons will give the correct order
    const int less = _mm_movemask_epi8( _mm_cmplt_epi8( a, b ) );
    const int greater = _mm_movemask_epi8( _mm_cmpgt_epi8( a, b ) );
    return less > greater;
}
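To see where the two variants differ, the sketch below repeats both functions and adds a small helper, makeU128, which is a made-up name for building a value from two 64-bit halves. It assumes an x86-64 target where _mm_set_epi64x is available.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics

// Signed-byte variant (order is total, but not the uint128_t order).
inline bool isLessThan(__m128i a, __m128i b) {
    const int less = _mm_movemask_epi8(_mm_cmplt_epi8(a, b));
    const int greater = _mm_movemask_epi8(_mm_cmpgt_epi8(a, b));
    return less > greater;
}

// Unsigned variant: flip the top bit of every byte, then compare as signed.
inline bool cmplt_u128(__m128i a, __m128i b) {
    const __m128i signBits = _mm_set1_epi8((char)0x80);
    a = _mm_xor_si128(a, signBits);
    b = _mm_xor_si128(b, signBits);
    const int less = _mm_movemask_epi8(_mm_cmplt_epi8(a, b));
    const int greater = _mm_movemask_epi8(_mm_cmpgt_epi8(a, b));
    return less > greater;
}

// Hypothetical helper: hi goes into bits 64..127, lo into bits 0..63.
inline __m128i makeU128(unsigned long long hi, unsigned long long lo) {
    return _mm_set_epi64x((long long)hi, (long long)lo);
}
```

For example, with a = makeU128(0, 0x01) and b = makeU128(0, 0x80), cmplt_u128(a, b) is true, matching 1 < 128, while isLessThan(a, b) is false because the byte 0x80 is read as -128.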
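We build the unsigned comparison by range-shifting the unsigned inputs into the signed range (flipping the high bit is equivalent to subtracting 128).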
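The range shift can be checked exhaustively on single bytes. This is just a scalar sketch of the idea, not part of the SIMD code; biasToSigned and biasPreservesOrder are names invented for this check.

```cpp
#include <cstdint>

// Map an unsigned byte onto the signed range by flipping its top bit.
// 0x00 -> -128, 0x7F -> -1, 0x80 -> 0, 0xFF -> +127.
// (The narrowing conversion wraps modulo 256 on mainstream compilers,
// and is guaranteed to do so from C++20 onwards.)
inline std::int8_t biasToSigned(std::uint8_t x) noexcept {
    return static_cast<std::int8_t>(x ^ 0x80u);
}

// Returns true if, for every pair of bytes, the unsigned comparison of the
// originals agrees with the signed comparison of the biased values.
inline bool biasPreservesOrder() noexcept {
    for (int a = 0; a < 256; ++a) {
        for (int b = 0; b < 256; ++b) {
            const bool unsignedLess = a < b;
            const bool signedLess =
                biasToSigned(static_cast<std::uint8_t>(a)) <
                biasToSigned(static_cast<std::uint8_t>(b));
            if (unsignedLess != signedLess) return false;
        }
    }
    return true;
}
```

Because the mapping is monotonic on every byte, a signed per-byte compare on the flipped inputs gives the same lane results as an unsigned compare on the originals.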