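While initially investigating the effect of the #pragma omp simd directive, I came across a behaviour that I cannot explain, related to the vectorization of a simple for loop. The following code sample can be tested on this awesome Compiler Explorer, provided the -O3 option is applied and the target is the x86 architecture.
Could anybody explain the logic behind the following observations?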
#include <stdint.h>
void test(uint8_t* out, uint8_t const* in, uint32_t length)
{
    unsigned const l1 = (length * 32)/32;  // This is vectorized
    unsigned const l2 = (length / 32)*32;  // This is not vectorized
    unsigned const l3 = (length << 5)>>5;  // This is vectorized
    unsigned const l4 = (length >> 5)<<5;  // This is not vectorized
    unsigned const l5 = length - length%32; // This is not vectorized
    unsigned const …
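
For reference, here is a minimal sketch of what the five expressions above evaluate to, in case it helps anyone reason about the observations. It assumes unsigned is 32 bits wide (as on x86), so the multiplication and left shift wrap modulo 2^32. The helper names expr_l1 to expr_l5 are hypothetical and only mirror the variable names in the snippet. This shows the arithmetic only and does not claim to explain the compiler's vectorization decisions.

```c
#include <stdint.h>

/* Hypothetical helpers, one per expression in the question.
 * Assumption: unsigned is 32 bits, so uint32_t stands in for it here. */

/* l1: the multiply wraps modulo 2^32, so the top 5 bits of length are lost. */
static uint32_t expr_l1(uint32_t length) { return (length * 32u) / 32u; }

/* l2: rounds length down to a multiple of 32 (clears the low 5 bits). */
static uint32_t expr_l2(uint32_t length) { return (length / 32u) * 32u; }

/* l3: same value as l1, written with shifts. */
static uint32_t expr_l3(uint32_t length) { return (length << 5) >> 5; }

/* l4: same value as l2, written with shifts. */
static uint32_t expr_l4(uint32_t length) { return (length >> 5) << 5; }

/* l5: same value as l2, written with a remainder. */
static uint32_t expr_l5(uint32_t length) { return length - length % 32u; }
```

So l1 and l3 equal length itself whenever length is below 2^27, and are not necessarily multiples of 32, whereas l2, l4 and l5 are always length rounded down to a multiple of 32. For example, with length = 100, l1 and l3 are 100 while l2, l4 and l5 are 96.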