Is there any difference between these functions? If not, why do both exist?
__m128 _mm_set1_ps(float a)
__m128 _mm_set_ps1(float a)
The two descriptions on the Intel Intrinsics Guide website are identical. Thank you.
There is zero difference. _mm_set1_ps is idiomatic; use it.
For example, clang's xmmintrin.h defines _mm_set_ps1 as a wrapper around _mm_set1_ps:
static __inline__ __m128 __DEFAULT_FN_ATTRS
_mm_set_ps1(float __w)
{
  return _mm_set1_ps(__w);
}
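If you want to check the equivalence yourself, here is a minimal sketch, assuming an x86 target with SSE enabled. broadcasts_match is a hypothetical helper name made up for this example; only the intrinsics and memcmp come from real headers.

```c
#include <string.h>     /* memcmp */
#include <xmmintrin.h>  /* SSE1: _mm_set1_ps, _mm_set_ps1, _mm_storeu_ps */

/* Hypothetical helper (not part of any header): returns 1 if both
   spellings broadcast x to the same 128-bit pattern, 0 otherwise. */
int broadcasts_match(float x)
{
    float a[4], b[4];
    _mm_storeu_ps(a, _mm_set1_ps(x));   /* idiomatic spelling */
    _mm_storeu_ps(b, _mm_set_ps1(x));   /* older alias */
    return memcmp(a, b, sizeof a) == 0; /* compare the raw bytes of all four lanes */
}
```

Calling broadcasts_match(1.5f) from any test function should return 1 with a header that defines one intrinsic in terms of the other, as clang's does.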
My guess is that Intel just hadn't settled on a naming scheme back in the early days of SSE1, and switched to _mm_set1_<type> going forward. But if they'd already documented _mm_set_ps1, they couldn't take it back.
Note that there is no _mm_set_epi321 or _mm_set_epi81 (fortunately)! Thus _mm_set1_ps is idiomatic and follows the same pattern as the other broadcast intrinsics, while _mm_set_ps1 is unusual and will surprise human readers. There are _mm_set_pd1 and _mm_load_pd1, though, and presumably they were introduced at the same time (with SSE2).
I only know about it because I stumbled over it the other day while looking for an intrinsic that would do a strict-aliasing-safe broadcast load, like you could with vpbroadcastd in asm. (There isn't a portable one that compiles efficiently everywhere; compiler support for intrinsics is a mixed bag of braindead pessimizations and missing intrinsics when you try to do anything complicated. Maybe in a few more years _mm_loadu_si32(void*) to zero-extend will at least be widely supported.) /end off-topic rant.
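For what it's worth, one way to write a strict-aliasing-safe broadcast load by hand is to copy the bytes out with memcpy and then broadcast the scalar. This is only a sketch, assuming SSE2 is available; broadcast_load_32 is a hypothetical name, and whether a given compiler folds it into a single broadcast-from-memory instruction is not guaranteed, which is exactly the complaint above.

```c
#include <emmintrin.h>  /* SSE2: __m128i, _mm_set1_epi32 */
#include <stdint.h>     /* int32_t */
#include <string.h>     /* memcpy */

/* Hypothetical helper: broadcast the 4 bytes at p to all four 32-bit lanes.
   memcpy makes the read well-defined for any pointer type and alignment,
   so it does not run into strict-aliasing problems. */
__m128i broadcast_load_32(const void *p)
{
    int32_t v;
    memcpy(&v, p, sizeof v);   /* aliasing-safe, possibly unaligned scalar load */
    return _mm_set1_epi32(v);  /* broadcast to every lane */
}
```

The pointer can refer to any object, for example an offset into an unsigned char buffer, as long as at least 4 readable bytes are there.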