Tag: intel-advisor

Inefficient memory access patterns and irregular stride access

I am trying to optimize this function:

bool interpolate(const Mat &im, float ofsx, float ofsy, float a11, float a12, float a21, float a22, Mat &res)
{         
   bool ret = false;
   // input size (-1 for the safe bilinear interpolation)
   const int width = im.cols-1;
   const int height = im.rows-1;
   // output size
   const int halfWidth  = res.cols >> 1;
   const int halfHeight = res.rows >> 1;
   float *out = res.ptr<float>(0);
   const float *imptr  = im.ptr<float>(0);
   for (int j=-halfHeight; j<=halfHeight; ++j)
   {
      const float rx = ofsx …
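The listing above is cut off, so the following is not the asker's loop body. It is only a minimal sketch of why bilinear sampling under an affine transform tends to be reported as an irregular-stride access: the source address of each output pixel depends on computed floating-point coordinates. The helper name `sampleBilinear` and the flat `std::vector<float>` image are assumptions made for illustration; the question itself uses `cv::Mat`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the question): bilinear sample of a
// row-major float image at the real-valued position (wx, wy).
// Returns false and writes 0 when the 2x2 neighbourhood leaves the image.
bool sampleBilinear(const std::vector<float>& img, int cols, int rows,
                    float wx, float wy, float& out)
{
    assert(img.size() == static_cast<std::size_t>(cols) * rows);

    const int x = static_cast<int>(std::floor(wx));
    const int y = static_cast<int>(std::floor(wy));

    // Same idea as the "-1 for the safe bilinear interpolation" bound:
    // (x + 1, y + 1) must still be a valid pixel.
    if (x < 0 || y < 0 || x >= cols - 1 || y >= rows - 1) {
        out = 0.0f;
        return false;
    }

    const float fx = wx - static_cast<float>(x);  // horizontal weight
    const float fy = wy - static_cast<float>(y);  // vertical weight

    // The base address depends on the computed (x, y), so consecutive
    // output pixels read from data-dependent locations (a gather)
    // rather than from a fixed unit stride.
    const float* p = &img[static_cast<std::size_t>(y) * cols + x];

    out = (1.0f - fy) * ((1.0f - fx) * p[0]    + fx * p[1]) +
                  fy  * ((1.0f - fx) * p[cols] + fx * p[cols + 1]);
    return true;
}
```

For a 2x2 image holding 0, 1, 2, 3 in row-major order, sampling at (0.5, 0.5) yields 1.5, and sampling at (1.0, 0.0) is rejected because the right-hand neighbour would fall outside the image.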

c++ parallel-processing intel vectorization intel-advisor

5 votes
1 answer
472 views