I often need to compute integral images. Here is a simple algorithm:
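#include <stddef.h>
#include <stdint.h>
#include <string.h>

void integral_sum(const uint8_t * src, size_t src_stride, size_t width, size_t height, uint32_t * sum, size_t sum_stride)
{
    /* The first output row is all zeros. */
    memset(sum, 0, (width + 1) * sizeof(uint32_t));
    /* Skip the zero row and the zero column. */
    sum += sum_stride + 1;
    for (size_t row = 0; row < height; row++)
    {
        /* Read the previous row through a pointer to avoid unsigned wrap-around in the index. */
        const uint32_t * prev = sum - sum_stride;
        uint32_t row_sum = 0;
        sum[-1] = 0;
        for (size_t col = 0; col < width; col++)
        {
            row_sum += src[col];
            sum[col] = row_sum + prev[col];
        }
        src += src_stride;
        sum += sum_stride;
    }
}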
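To make the output layout concrete, here is a minimal sketch of how the result can be used. It assumes the `sum` buffer holds `height + 1` rows of at least `width + 1` entries each, with an extra zero row and zero column at the top and left. The `box_sum` helper is hypothetical and only added for illustration; the `assert` on `sum_stride` is likewise my assumption about the intended precondition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar integral image, as in the question; the previous row is read through a pointer. */
void integral_sum(const uint8_t *src, size_t src_stride,
                  size_t width, size_t height,
                  uint32_t *sum, size_t sum_stride)
{
    /* Assumption: each output row has room for the leading zero column. */
    assert(sum_stride >= width + 1);
    memset(sum, 0, (width + 1) * sizeof(uint32_t));
    sum += sum_stride + 1;
    for (size_t row = 0; row < height; row++)
    {
        const uint32_t *prev = sum - sum_stride;
        uint32_t row_sum = 0;
        sum[-1] = 0;
        for (size_t col = 0; col < width; col++)
        {
            row_sum += src[col];
            sum[col] = row_sum + prev[col];
        }
        src += src_stride;
        sum += sum_stride;
    }
}

/* Hypothetical helper: sum of the pixels in the half-open box [x0, x1) x [y0, y1). */
uint32_t box_sum(const uint32_t *sum, size_t sum_stride,
                 size_t x0, size_t y0, size_t x1, size_t y1)
{
    return sum[y1 * sum_stride + x1] - sum[y0 * sum_stride + x1]
         - sum[y1 * sum_stride + x0] + sum[y0 * sum_stride + x0];
}
```

For a 3x2 image with rows {1, 2, 3} and {4, 5, 6} and `sum_stride` equal to 4, the output rows are {0, 0, 0, 0}, {0, 1, 3, 6} and {0, 5, 12, 21}. Any box sum then costs four lookups: `box_sum(sum, 4, 1, 0, 3, 2)` gives 2 + 3 + 5 + 6 = 16.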
My question: can I speed up this algorithm (for example, using SSE or AVX)?