(opencv rc1) What causes Mat multiplication to be 20x slower than per-pixel multiplication?

Question

Boy*_*nov 8 c++ java-native-interface opencv arm neon

// 700 ms
cv::Mat in(height,width,CV_8UC1);
in /= 4;


replaced with

//40 ms
cv::Mat in(height,width,CV_8UC1);
for (int y=0; y < in.rows; ++y)
{
    unsigned char* ptr = in.data + y*in.step1();
    for (int x=0; x < in.cols; ++x)
    {
        ptr[x] /= 4;
    }
}


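What could cause this behavior? Is it because OpenCV "promotes" a Mat-by-Scalar multiplication to a Mat-by-Mat multiplication, or is it a particular optimization that fails here? (NEON is enabled.)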

Answer 1

Mic*_*cka 1

I tried the same thing, measuring CPU time.

#include <ctime>
#include <iostream>
#include <opencv2/opencv.hpp>

int main()
{
    clock_t startTime;
    clock_t endTime;

    int height =1024;
    int width =1024;

    // 700 ms
    cv::Mat in(height,width,CV_8UC1, cv::Scalar(255));
    std::cout << "value: " << (int)in.at<unsigned char>(0,0) << std::endl;

    cv::Mat out(height,width,CV_8UC1);

    startTime = clock();
    out = in/4;
    endTime = clock();
    std::cout << "1: " << (float)(endTime-startTime)/(float)CLOCKS_PER_SEC << std::endl;
    std::cout << "value: " << (int)out.at<unsigned char>(0,0) << std::endl;


    startTime = clock();
    in /= 4;
    endTime = clock();
    std::cout << "2: " <<  (float)(endTime-startTime)/(float)CLOCKS_PER_SEC << std::endl;
    std::cout << "value: " << (int)in.at<unsigned char>(0,0) << std::endl;

    //40 ms
    cv::Mat in2(height,width,CV_8UC1, cv::Scalar(255));

    startTime = clock();
    for (int y=0; y < in2.rows; ++y)
    {
        //unsigned char* ptr = in2.data + y*in2.step1();
        unsigned char* ptr = in2.ptr(y);
        for (int x=0; x < in2.cols; ++x)
        {
            ptr[x] /= 4;
        }
    }
    // Stop the timer before printing so console I/O is not part of the measurement.
    endTime = clock();
    std::cout << "3: " <<  (float)(endTime-startTime)/(float)CLOCKS_PER_SEC << std::endl;
    std::cout << "value: " << (int)in2.at<unsigned char>(0,0) << std::endl;


    cv::namedWindow("...");
    cv::waitKey(0);
}


Results:

value: 255
1: 0.016
value: 64
2: 0.016
value: 64
3: 0.003
value: 63


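As you can see, the results differ, probably because mat.divide() performs a floating-point division and rounds to the nearest integer (255/4 = 63.75, which rounds to 64). Your faster version uses integer division, which is quicker but truncates, giving 63.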
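The 64 versus 63 discrepancy can be reproduced without OpenCV. The sketch below is only an illustration: `divide_truncate` and `divide_round` are hypothetical helper names, and `std::lround` stands in for whatever rounding OpenCV applies internally, which is an assumption rather than the library's actual code path.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Integer division as in the hand-written loop: the fractional part is dropped.
// Assumes d >= 1.
inline std::uint8_t divide_truncate(std::uint8_t v, int d) {
    return static_cast<std::uint8_t>(v / d);
}

// Floating-point division followed by rounding to the nearest integer.
// std::lround is an assumed stand-in for OpenCV's internal rounding.
// Assumes d >= 1, so the result always fits in 8 bits.
inline std::uint8_t divide_round(std::uint8_t v, int d) {
    return static_cast<std::uint8_t>(std::lround(static_cast<double>(v) / d));
}
```

For an input of 255 and a divisor of 4, `divide_truncate` yields 63 and `divide_round` yields 64, matching the values printed above.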

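OpenCV's computation also involves a saturate_cast, but I would guess that the larger difference in computational load comes from the double-precision division.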
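To make the extra per-pixel work concrete, here is a rough sketch of what a saturating conversion from double to an 8-bit value involves. `saturate_to_u8` is a hypothetical stand-in written for illustration, not OpenCV's `cv::saturate_cast` itself, and the rounding via `std::lrint` is an assumption about the default rounding mode.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for a saturating double -> 8-bit conversion:
// round to an integer, then clamp into the range [0, 255].
inline std::uint8_t saturate_to_u8(double x) {
    // std::lrint rounds according to the current rounding mode
    // (round-to-nearest by default).
    long r = std::lrint(x);
    r = std::max(0L, std::min(255L, r));
    return static_cast<std::uint8_t>(r);
}
```

Compared with the plain `ptr[x] /= 4` loop, each pixel here costs a conversion to double, a division, a rounding step, and two comparisons, which is in line with the answer's guess about where the extra time goes.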
