Posts by Nic*_*Lee

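Using ARM NEON intrinsics to add alpha and permute

I'm working on an iOS app that needs to convert images from RGB -> BGRA fairly quickly. I'd like to use NEON intrinsics if possible. Is there a faster way than simply assigning the components?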

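#include <arm_neon.h>

// Converts packed RGB to BGRA with alpha = 0xff, 8 pixels per iteration.
// numPix is assumed to be a multiple of 8; any remainder is skipped.
void neonPermuteRGBtoBGRA(const unsigned char* src, unsigned char* dst, int numPix)
{
    numPix /= 8; //process 8 pixels at a time

    uint8x8_t alpha = vdup_n_u8 (0xff);

    for (int i=0; i<numPix; i++)
    {
        uint8x8x3_t rgb  = vld3_u8 (src); //de-interleaving load: val[0]=R, val[1]=G, val[2]=B
        uint8x8x4_t bgra;

        bgra.val[0] = rgb.val[2]; //these lines are slow
        bgra.val[1] = rgb.val[1]; //these lines are slow
        bgra.val[2] = rgb.val[0]; //these lines are slow

        bgra.val[3] = alpha;

        vst4_u8(dst, bgra); //interleaving store: writes B,G,R,A for 8 pixels

        src += 8*3;
        dst += 8*4;
    }
}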
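To check any faster variant against known-good output, and to cover pixel counts that are not a multiple of 8, a portable version along these lines might help. It is only a sketch: the name permuteRGBtoBGRA and the scalar tail loop are illustrative and not part of the code above. The NEON path is compiled only when the compiler defines __ARM_NEON (or the older __ARM_NEON__), so the same file also builds on a desktop machine for testing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper (not from the original post): converts numPix packed
   RGB pixels to BGRA with alpha = 0xff. Works for any pixel count. */
static void permuteRGBtoBGRA(const uint8_t *src, uint8_t *dst, size_t numPix)
{
    assert(src != NULL && dst != NULL);
    size_t i = 0;

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Bulk path: 8 pixels per iteration, same intrinsics as the question. */
    const uint8x8_t alpha = vdup_n_u8(0xff);
    for (; i + 8 <= numPix; i += 8) {
        uint8x8x3_t rgb = vld3_u8(src + 3 * i); /* val[0]=R, val[1]=G, val[2]=B */
        uint8x8x4_t bgra;
        bgra.val[0] = rgb.val[2];
        bgra.val[1] = rgb.val[1];
        bgra.val[2] = rgb.val[0];
        bgra.val[3] = alpha;
        vst4_u8(dst + 4 * i, bgra);
    }
#endif

    /* Scalar path: handles the remaining pixels, or every pixel when
       NEON is not available at compile time. */
    for (; i < numPix; i++) {
        dst[4 * i + 0] = src[3 * i + 2]; /* B */
        dst[4 * i + 1] = src[3 * i + 1]; /* G */
        dst[4 * i + 2] = src[3 * i + 0]; /* R */
        dst[4 * i + 3] = 0xff;           /* A */
    }
}
```

Calling it with a count such as 11 exercises both paths on an ARM build (one NEON iteration plus three scalar pixels), which makes it a convenient reference when timing or validating other approaches.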

arm intrinsics neon cortex-a8

5 votes, 2 answers, 2299 views
