Faster method of accessing a channel from an RGB image in OpenCV?

Tae*_*hin 2 c++ opencv image

In my trials with images of 1409x900 and 960x696, it takes 2.5 ms on average to split the channels of an RGB image using OpenCV on my 64-bit, 6-core, 3.2 GHz Windows machine.

vector<cv::Mat> channels;
cv::split(img, channels);

I found that this is about the same amount of time as the rest of my image processing (a boolean operation plus a morphological opening).

Since my code uses only one of the channels produced by the split, I wonder whether there is a faster way to extract a single channel from an RGB image, preferably with OpenCV.

UPDATE

As @DanMašek pointed out, there is another function, mixChannels, that can extract a single-channel image from a multi-channel one. I tested it on about 2000 images of the same sizes, and mixChannels took about 1 ms on average. For now, I am satisfied with the result, but post your answer if you can make it faster.

cv::Mat channel(img.rows, img.cols, CV_8UC1);
int from_to[] = { sel_channel,0 };
mixChannels(&img, 1, &channel, 1, from_to, 1);

Dan*_*šek 5

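Two simple options come to mind here.

  1. You mention that you perform this operation repeatedly on images taken from a camera. It is therefore safe to assume that the images are always the same size.

    Allocating a cv::Mat has a non-negligible overhead, so in this case it would be beneficial to reuse the channel Mats (i.e. allocate the destination images when you receive the first frame, and then just overwrite their contents for subsequent frames).

    An additional benefit of this approach is (very likely) reduced memory fragmentation, which can become a real problem for 32-bit code.

  2. You mention that you are interested in only one specific channel (which the user may select arbitrarily). This means that you can use cv::mixChannels, which gives you the flexibility to select which channels to extract and how.

    You can therefore extract the data for a single channel only, which in theory (depending on the implementation; study the source code for more details) avoids the overhead of extracting and/or copying data for the channels you are not interested in.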
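To make the reasoning behind option 2 concrete, here is a minimal sketch of what single-channel extraction amounts to on interleaved pixel data. It is not OpenCV's implementation: `extract_channel` is a hypothetical helper, and it assumes a tightly packed 8-bit buffer with no row padding (a real cv::Mat may have padded rows).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an OpenCV function): copy one channel out of a
// tightly packed, interleaved 8-bit image (e.g. B0 G0 R0 B1 G1 R1 ...).
// Only the selected channel is written out; the other channels are never
// copied anywhere, which is the saving option 2 is after.
std::vector<std::uint8_t> extract_channel(std::vector<std::uint8_t> const& interleaved,
                                          std::size_t channels,
                                          std::size_t selected)
{
    assert(channels > 0 && selected < channels);
    assert(interleaved.size() % channels == 0);

    std::size_t const pixels = interleaved.size() / channels;
    std::vector<std::uint8_t> plane(pixels);
    for (std::size_t i = 0; i < pixels; ++i) {
        // Strided read: jump `channels` bytes per pixel, offset by the channel index.
        plane[i] = interleaved[i * channels + selected];
    }
    return plane;
}
```

For a 3-channel buffer `{1,2,3, 4,5,6}`, calling `extract_channel(buf, 3, 1)` yields `{2, 5}`. A full split would instead write three such planes, two of which the caller throws away.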


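Let's make a test program to evaluate the 4 possible combinations of the above approaches.

  • Variant 0: cv::split without reuse
  • Variant 1: cv::split with reuse
  • Variant 2: cv::mixChannels without reuse
  • Variant 3: cv::mixChannels with reuse

NB: I'm using static here just for simplicity; normally I would make these member variables of a class wrapping the algorithm.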
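As a rough illustration of the "member variable in a wrapping class" alternative to static, the sketch below keeps the destination buffer as a class member. To stay self-contained it uses a plain std::vector over a tightly packed interleaved buffer instead of cv::Mat; `ChannelExtractor` and its members are hypothetical names, not part of OpenCV.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical wrapper class: the destination buffer is a member, so it is
// allocated on the first frame and only overwritten on later frames of the
// same size.
class ChannelExtractor
{
public:
    ChannelExtractor(std::size_t channels, std::size_t selected)
        : channels_(channels), selected_(selected)
    {
        assert(channels_ > 0 && selected_ < channels_);
    }

    // Returns a reference to the internal buffer. The contents are
    // overwritten by the next call.
    std::vector<std::uint8_t> const& operator()(std::vector<std::uint8_t> const& interleaved)
    {
        assert(interleaved.size() % channels_ == 0);
        std::size_t const pixels = interleaved.size() / channels_;
        plane_.resize(pixels); // no reallocation when the frame size is unchanged
        for (std::size_t i = 0; i < pixels; ++i) {
            plane_[i] = interleaved[i * channels_ + selected_];
        }
        return plane_;
    }

private:
    std::size_t channels_;
    std::size_t selected_;
    std::vector<std::uint8_t> plane_; // reused across frames
};
```

Each camera stream would own one `ChannelExtractor`, which avoids the shared hidden state of a function-local static. The caveat is the same as with a reused cv::Mat: the result of one call is overwritten by the next, so copy it if it has to outlive the following frame.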


#include <opencv2/opencv.hpp>

#include <chrono>
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

#define SELECTED_CHANNEL 1

cv::Mat variant_0(cv::Mat const& img)
{
    std::vector<cv::Mat> channels;
    cv::split(img, channels);
    return channels[SELECTED_CHANNEL];
}

cv::Mat variant_1(cv::Mat const& img)
{
    static std::vector<cv::Mat> channels;
    cv::split(img, channels);
    return channels[SELECTED_CHANNEL];
}

cv::Mat variant_2(cv::Mat const& img)
{
    // NB: output Mat must be preallocated
    cv::Mat channel(img.rows, img.cols, CV_8UC1);
    int from_to[] = { SELECTED_CHANNEL, 0 };
    cv::mixChannels(&img, 1, &channel, 1, from_to, 1);
    return channel;
}

cv::Mat variant_3(cv::Mat const& img)
{
    // NB: output Mat must be preallocated
    static cv::Mat channel(img.rows, img.cols, CV_8UC1);
    int from_to[] = { SELECTED_CHANNEL, 0 };
    cv::mixChannels(&img, 1, &channel, 1, from_to, 1);
    return channel;
}

template<typename T>
void timeit(std::string const& title, T f)
{
    using std::chrono::high_resolution_clock;
    using std::chrono::duration_cast;
    using std::chrono::microseconds;

    cv::Mat img(1024,1024, CV_8UC3);
    cv::randu(img, 0, 256);

    int32_t const STEPS(1024);

    high_resolution_clock::time_point t1 = high_resolution_clock::now();
    for (int32_t i(0); i < STEPS; ++i) {
        cv::Mat result = f(img);
    }
    high_resolution_clock::time_point t2 = high_resolution_clock::now();

    auto duration = duration_cast<microseconds>(t2 - t1).count();
    double t_ms(static_cast<double>(duration) / 1000.0);
    std::cout << title << "\n"
        << "Total = " << t_ms << " ms\n"
        << "Iteration = " << (t_ms / STEPS) << " ms\n"
        << "FPS = " << (STEPS / t_ms * 1000.0) << "\n"
        << "\n";
}

int main()
{
    for (uint8_t i(0); i < 2; ++i) {
        timeit("Variant 0", variant_0);
        timeit("Variant 1", variant_1);
        timeit("Variant 2", variant_2);
        timeit("Variant 3", variant_3);
        std::cout << "--------------------------\n\n";
    }

    return 0;
}

Output of the second pass (so that we avoid any warm-up costs).

NB: Run on an i7-4930K, using OpenCV 3.1.0 (64-bit, MSVC 12.0), Windows 10. YMMV, especially on CPUs with AVX2.

Variant 0
Total = 1518.69 ms
Iteration = 1.48309 ms
FPS = 674.267

Variant 1
Total = 359.048 ms
Iteration = 0.350633 ms
FPS = 2851.99

Variant 2
Total = 820.223 ms
Iteration = 0.800999 ms
FPS = 1248.44

Variant 3
Total = 427.089 ms
Iteration = 0.417079 ms
FPS = 2397.63

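Interestingly, cv::split with reuse wins here. Feel free to edit the answer and add timings from different platforms/CPU generations (especially if the proportions are radically different).

It also seems that none of this is parallelized well in my setup, so that could be another possible avenue for speeding this up (something like cv::parallel_for_).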
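The parallelization idea can be sketched as follows. This is not cv::parallel_for_ itself: it swaps in plain std::thread and a tightly packed interleaved buffer so the example stays self-contained, and `extract_channel_parallel` is a hypothetical helper. The row-band partitioning is the same idea one would express with a cv::Range body. It is untested against the timings above, and since the copy is largely memory-bound, any gain is an assumption to be measured.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper: extract one channel from a tightly packed interleaved
// 8-bit image, splitting the rows into contiguous bands, one per worker.
void extract_channel_parallel(std::vector<std::uint8_t> const& interleaved,
                              std::size_t rows, std::size_t cols,
                              std::size_t channels, std::size_t selected,
                              std::vector<std::uint8_t>& plane,
                              std::size_t workers)
{
    assert(channels > 0 && selected < channels && workers > 0);
    assert(interleaved.size() == rows * cols * channels);
    plane.resize(rows * cols); // reuses the caller's buffer when the size matches

    std::uint8_t const* src = interleaved.data();
    std::uint8_t* dst = plane.data();
    std::size_t const band = (rows + workers - 1) / workers; // rows per worker, rounded up

    std::vector<std::thread> pool;
    for (std::size_t first = 0; first < rows; first += band) {
        std::size_t const last = std::min(first + band, rows);
        // Each worker writes a disjoint range of `dst`, so no locking is needed.
        pool.emplace_back([=] {
            for (std::size_t i = first * cols; i < last * cols; ++i) {
                dst[i] = src[i * channels + selected];
            }
        });
    }
    for (std::thread& t : pool) {
        t.join();
    }
}
```

Spawning threads on every frame has its own cost, so in practice a persistent pool (which is what the cv::parallel_for_ backends provide) would be the more realistic choice. The sketch only shows how the work divides.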