Sorting contours based on precedence in Python, OpenCV

Question

Jim*_*ela 5 python opencv contour python-3.7

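I am trying to sort contours based on their arrival, left-to-right and top-to-bottom, just like you would write anything: from the top, then from the left, whichever comes first.

This is what I have achieved so far, and how: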

import os

import cv2
from PIL import Image


def get_contour_precedence(contour, cols):
    tolerance_factor = 61
    origin = cv2.boundingRect(contour)
    return ((origin[1] // tolerance_factor) * tolerance_factor) * cols + origin[0]


image = cv2.imread("C:/Users/XXXX/PycharmProjects/OCR/raw_dataset/23.png", 0)

ret, thresh1 = cv2.threshold(image, 130, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

contours, h = cv2.findContours(thresh1.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# sort the contours with the precedence key; sorted() is used because
# cv2.findContours returns a tuple in recent OpenCV releases
contours = sorted(contours, key=lambda c: get_contour_precedence(c, thresh1.shape[1]))

# initialize the list of contour bounding boxes and associated
# characters that we'll be OCR'ing
chars = []
inc = 0
# loop over the contours
for c in contours:
    inc += 1

    # compute the bounding box of the contour
    (x, y, w, h) = cv2.boundingRect(c)

    label = str(inc)
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.putText(image, label, (x - 2, y - 2),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
    print('x=', x)
    print('y=', y)
    print('x+w=', x + w)
    print('y+h=', y + h)
    crop_img = image[y + 2:y + h - 1, x + 2:x + w - 1]
    name = os.path.join("bounding boxes", 'Image_%d.png' % (
        inc))
    cv2.imshow("cropped", crop_img)
    print(name)
    crop_img = Image.fromarray(crop_img)
    crop_img.save(name)
    cv2.waitKey(0)

cv2.imshow('mat', image)
cv2.waitKey(0)

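Input Image:

Output Image 1:

Input Image 2:

Output for Image 2:

Input Image 3:

Output Image 3:

As you can see, the numbering 1, 2, 3, 4 is not what I was expecting in each image, as shown in image number 3.

How do I adjust this to make it work, or even write a custom function?

NOTE: I have multiple images of the same input image provided in my question. The content is the same, but the text varies between them, so the tolerance factor does not work for every one of them. Manually adjusting it would not be a good idea.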
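
To make the failure mode concrete: the key function floors each `y` to a multiple of `tolerance_factor`, so two boxes on the same visual line can land in different bands whenever a band boundary falls between them. A minimal sketch, using made-up box origins and an assumed image width of 1000 (neither value comes from the images above):

```python
def precedence(x, y, cols, tolerance_factor=61):
    # Same arithmetic as get_contour_precedence, applied to a box origin.
    return ((y // tolerance_factor) * tolerance_factor) * cols + x


cols = 1000  # assumed image width, for illustration only

# Two characters on the same text line, 4 px apart vertically (made-up values).
left_char = (50, 62)    # (x, y): y falls in the band starting at 61
right_char = (300, 58)  # (x, y): y falls in the band starting at 0

keys = {
    "left": precedence(*left_char, cols),
    "right": precedence(*right_char, cols),
}
order = sorted(keys, key=keys.get)
print(keys, order)
```

Here the right-hand character sorts first even though both sit on one line, and shifting the tolerance only moves the band boundary somewhere else.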

Answer 1

sta*_*ine 4

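This is my take on the problem. I'll give you the general gist of it, and then my implementation in C++. The main idea is that I want to process the image from left to right, top to bottom. I'll process each blob (or contour) as I find it; however, I need a couple of intermediate steps to achieve a successful (ordered) segmentation.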

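Vertical sort using rows

The first step is to try to sort the blobs by rows: this means that each row has a set of (unordered) horizontal blobs. That's OK. The first step is computing some sort of vertical ordering, and if we process each row from top to bottom, we achieve just that.

After the blobs are (vertically) sorted by rows, I can check their centroids (or centers of mass) and sort them horizontally. The idea is that I will process row by row and, for each row, sort the blob centroids. Let's see an example of what I'm trying to achieve here.

This is your input image:

This is what I call the Row Mask:

This last image contains white areas that each represent a "row". Each row has a number (e.g., Row1, Row2, etc.) and each row holds a set of blobs (or characters, in this case). By processing each row, top to bottom, you are already sorting the blobs on the vertical axis.

If I number each row from top to bottom, I get this image:

The Row Mask is a way of creating "rows of blobs", and this mask can be computed morphologically. Check out the 2 images overlaid, to give you a better view of the processing order:

What we are trying to do here is, first, a vertical ordering (blue arrow), and then we will take care of the horizontal ordering (red arrow). You can see that by processing each row we can (possibly) overcome the sorting problem!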
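
The two-level ordering described above boils down to a sort on a `(row, x)` pair. A minimal Python sketch, assuming each blob already carries the row number read from the Row Mask and the x coordinate of its centroid; the `Blob` type and the sample values are hypothetical and only illustrate the idea:

```python
from typing import NamedTuple


class Blob(NamedTuple):
    label: int  # arbitrary label assigned when the blob was found
    row: int    # row number from the Row Mask (1 = topmost)
    cx: int     # horizontal coordinate of the centroid


def reading_order(blobs):
    # Vertical ordering first (row), then horizontal ordering (centroid x).
    return sorted(blobs, key=lambda b: (b.row, b.cx))


blobs = [Blob(1, 2, 40), Blob(2, 1, 120), Blob(3, 1, 15), Blob(4, 2, 5)]
ordered_labels = [b.label for b in reading_order(blobs)]
print(ordered_labels)  # [3, 2, 4, 1]
```

The rest of the answer reaches the same ordering through image-shaped containers rather than a tuple sort.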
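
Horizontal sort using centroids

Let's now see how we can sort the blobs horizontally. If we create a simpler image, with a width equal to the input image and a height equal to the number of rows in the Row Mask, we can simply overlay the horizontal coordinate (x coordinate) of each blob centroid. Check out this example:

This is a Row Table. Each row of the table corresponds to one row found in the Row Mask, and the table is also read from top to bottom. The width of the table is the same as the width of your input image and corresponds spatially to the horizontal axis. Each square is a pixel in your input image, mapped to the Row Table using only the horizontal coordinate (as our simplification of rows is pretty straightforward). The actual value of each pixel in the Row Table is a label, identifying one of the blobs on your input image. Note that the labels are not ordered!

So, for example, this table shows that in row 1 (you already know what row 1 is: the first white area on the Row Mask), position (1,4) holds blob number 3. Position (1,6) holds blob number 2, and so on. What's cool (I think) about this table is that you can loop through it, and for every value different from 0, horizontal ordering becomes trivial. This is the Row Table ordered, now, from left to right: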
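
Reading such a table is just a nested loop. A minimal Python sketch with the table held as a list of lists, using 0 for an empty cell as in the description above; the table contents are made up and `read_row_table` is a hypothetical name:

```python
def read_row_table(row_table):
    """Return the labels in reading order; 0 marks an empty cell."""
    ordered = []
    for row in row_table:      # top to bottom
        for label in row:      # left to right
            if label != 0:
                ordered.append(label)
    return ordered


# A tiny 2-row table for an 8-pixel-wide image (made-up labels).
row_table = [
    [0, 0, 0, 0, 3, 0, 2, 0],
    [0, 1, 0, 0, 0, 4, 0, 0],
]

# Map each original (unordered) label to its position in reading order.
new_labels = {
    old: new for new, old in enumerate(read_row_table(row_table), start=1)
}
print(new_labels)  # {3: 1, 2: 2, 1: 3, 4: 4}
```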
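
Mapping blob information with centroids

We are going to use blob centroids to map the information between our two representations (Row Mask/Row Table). Suppose you already have both "helper" images and you process each blob (or contour) on the input image one at a time. For example, you start with this:

Alright, there's a blob here. How can we map it to the Row Mask and to the Row Table? Using its centroid. If we compute the centroid (shown in the figure as the green dot), we can construct a dictionary of centroids and labels. For example, for this blob, the centroid is located at (271,193). OK, let's assign the label = 1. So we now have this dictionary:

Now, we find the row in which this blob is placed by looking up the same centroid on the Row Mask. Something like this:

rowNumber = rowMask.at( 271, 193 )   // pseudocode: look up the centroid (x, y)

This operation should return rowNumber = 3. Nice! We know in which row our blob is placed, so it is now vertically ordered. Now, let's store its horizontal coordinate in the Row Table:

rowTable.at( rowNumber, 271 ) = 1   // pseudocode: (row, x)

Now, rowTable holds (in its row and column) the label of the processed blob. The Row Table should look like this:

The table is a lot wider, because its horizontal dimension has to be the same as your input image. In this image, the label 1 is placed in Column 271, Row 3. If this was the only blob on your image, the blobs would already be sorted. But what happens if you add another blob in, say, Column 2, Row 1? That's why you need to traverse this table again after you have processed all the blobs: to properly correct their labels.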
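
The mapping step can be sketched in Python with plain nested lists standing in for the two helper images. This is only an illustration under stated assumptions: `place_blob`, the toy mask, and the centroids are hypothetical, row labels start at 1 with 0 as background, and empty table cells hold 0:

```python
def place_blob(row_mask, row_table, blob_map, label, centroid):
    """Record `label` in the row table at the row found under its centroid."""
    cx, cy = centroid
    blob_map[label] = centroid             # dictionary: label -> centroid
    row_number = row_mask[cy][cx]          # row labels start at 1; 0 = background
    if row_number == 0:
        raise ValueError("centroid does not fall on any row of the mask")
    row_table[row_number - 1][cx] = label  # only the x coordinate is kept
    return row_number


# Toy 6x8 row mask with two labelled rows (made-up data).
row_mask = [
    [0] * 8,
    [1] * 8,
    [1] * 8,
    [0] * 8,
    [2] * 8,
    [0] * 8,
]
row_table = [[0] * 8 for _ in range(2)]
blob_map = {}

place_blob(row_mask, row_table, blob_map, label=1, centroid=(5, 4))
place_blob(row_mask, row_table, blob_map, label=2, centroid=(2, 1))
print(row_table)
```

One caveat of this layout: two blobs in the same row whose centroids share an x coordinate would overwrite each other's cell.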

Implementation in C++

Alright, hopefully the algorithm is a little bit clearer now (if not, just ask, my man). I'll try to implement these ideas in OpenCV using C++. First, I need a binary image of your input. Computing it is trivial using Otsu's thresholding method:

//Headers used by the snippets in this answer:
#include <opencv2/opencv.hpp>
#include <map>
#include <string>

//Read the input image:
std::string imageName = "C://opencvImages//yFX3M.png";
cv::Mat testImage = cv::imread( imageName );

//Compute grayscale image (cv::imread returns BGR channel order):
cv::Mat grayImage;
cv::cvtColor( testImage, grayImage, cv::COLOR_BGR2GRAY );

//Get binary image via Otsu:
cv::Mat binImage;
cv::threshold( grayImage, binImage, 0, 255, cv::THRESH_OTSU );

//Invert image:
binImage = 255 - binImage;
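
For readers who want to see what the Otsu step computes, here is a pure-Python sketch of the idea: pick the threshold that maximises the between-class variance of the histogram, then invert so dark text becomes white. It works on a flat list of 8-bit values, not on an OpenCV image; `otsu_threshold`, `binarize_inverted`, and the sample data are hypothetical, and OpenCV's own implementation may pick a slightly different threshold value.

```python
def otsu_threshold(pixels):
    """Return the threshold t that maximises between-class variance.

    Pixels <= t form one class and pixels > t the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the low class
    sum0 = 0  # intensity sum of the low class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize_inverted(pixels, t):
    # Dark pixels (<= t) become 255 and bright ones 0, so text turns white.
    return [255 if p <= t else 0 for p in pixels]


# Made-up 1-D "image": dark ink around 10-12, bright paper around 200-210.
pixels = [10] * 50 + [12] * 50 + [200] * 40 + [210] * 60
t = otsu_threshold(pixels)
binary = binarize_inverted(pixels, t)
print(t, binary.count(255))
```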

Here's the resulting binary image, nothing fancy, just what we need to begin working:

The first step is to get the Row Mask. This can be achieved using morphology. Just apply a dilation + erosion with a VERY big horizontal structuring element. The idea is that you want to turn those blobs into rectangles, "fusing" them together horizontally:

//Create a hard copy of the binary mask:
cv::Mat rowMask = binImage.clone();

//horizontal dilation + erosion:
int horizontalSize = 100; // a very big horizontal structuring element
cv::Mat SE = cv::getStructuringElement( cv::MORPH_RECT, cv::Size(horizontalSize,1) );
cv::morphologyEx( rowMask, rowMask, cv::MORPH_DILATE, SE, cv::Point(-1,-1), 2 );
cv::morphologyEx( rowMask, rowMask, cv::MORPH_ERODE, SE, cv::Point(-1,-1), 1 );
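
The fusing effect of a wide horizontal structuring element can be seen on a toy grid. The sketch below is a pure-Python stand-in, not the OpenCV calls: it applies one horizontal dilation followed by one erosion (the snippet above dilates twice), clips the window at the image border, and uses hypothetical helper names and data:

```python
def dilate_h(mask, width):
    """Horizontal dilation: a pixel becomes 1 if any pixel in its window is 1."""
    half = width // 2
    out = []
    for row in mask:
        n = len(row)
        out.append([
            1 if any(row[max(0, x - half):min(n, x + half + 1)]) else 0
            for x in range(n)
        ])
    return out


def erode_h(mask, width):
    """Horizontal erosion: a pixel stays 1 only if its whole window is 1."""
    half = width // 2
    out = []
    for row in mask:
        n = len(row)
        out.append([
            1 if all(row[max(0, x - half):min(n, x + half + 1)]) else 0
            for x in range(n)
        ])
    return out


# Two small blobs on the first line, nothing on the second (made-up data).
mask = [
    [0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]
dilated = dilate_h(mask, 5)
closed = erode_h(dilated, 5)
print(closed[0])  # the gap between the two blobs is filled
```

The gap between the two blobs is bridged while the empty line stays empty, which is what lets each text line become one connected region.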

This results in the following Row Mask:

That's very cool. Now that we have our Row Mask, we must number the rows, OK? There are a lot of ways to do this, but right now I'm interested in the simplest one: loop through this image and visit every single pixel. If a pixel is white, use a Flood Fill operation to label that portion of the image as a unique blob (or row, in this case). This can be done as follows:

//Label the row mask:
int rowCount = 0; //This will count our rows

//Loop thru the mask:
for( int y = 0; y < rowMask.rows; y++ ){
    for( int x = 0; x < rowMask.cols; x++ ){
        //Get the current pixel:
        uchar currentPixel = rowMask.at<uchar>( y, x );
        //If the pixel is white, this is an unlabeled blob:
        if ( currentPixel == 255 ) {
            //Create new label (different from zero):
            rowCount++;
            //Flood fill on this point:
            cv::floodFill( rowMask, cv::Point( x, y ), rowCount, (cv::Rect*)0, cv::Scalar(), 0 );
        }
    }
}

This process will label all the rows from 1 to r. That's what we wanted. If you check out the image you'll faintly see the rows; that's because our labels correspond to very low intensity values of grayscale pixels.
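
The same scan-and-fill labelling can be sketched without OpenCV. The version below is an assumption-laden illustration: it uses a breadth-first flood fill with 4-connectivity, writes labels into a separate grid instead of overwriting the mask, and `label_rows` and the toy mask are hypothetical:

```python
from collections import deque


def label_rows(mask):
    """Label each 4-connected white region of `mask` (1 = white) in scan order.

    Returns (labels, count); labels uses 0 for background and 1..count for rows.
    """
    height, width = len(mask), len(mask[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if mask[y][x] != 1 or labels[y][x] != 0:
                continue  # background, or already part of a labelled row
            count += 1
            labels[y][x] = count
            queue = deque([(x, y)])
            while queue:
                px, py = queue.popleft()
                for nx, ny in ((px + 1, py), (px - 1, py), (px, py + 1), (px, py - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and mask[ny][nx] == 1 and labels[ny][nx] == 0):
                        labels[ny][nx] = count
                        queue.append((nx, ny))
    return labels, count


# Toy row mask: one full-width row and one shorter, taller row (made-up data).
mask = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
labels, count = label_rows(mask)
print(count, labels)
```

Because the scan runs top to bottom, the topmost row receives label 1, the next one label 2, and so on.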
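
OK, now let's prepare the Row Table. This "table" really is just another image, remember: same width as the input, and height equal to the number of rows you counted on the Row Mask:

//create rows image:
cv::Mat rowTable = cv::Mat::zeros( cv::Size(binImage.cols, rowCount), CV_8UC1 );
//Just for convenience:
rowTable = 255 - rowTable;

Here, I just inverted the final image for convenience, because I want to actually see how the table is populated with (very low intensity) pixels and be sure that everything is working as intended.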
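
Now comes the fun part. We've got both images (or data containers) prepared. We need to process each blob independently. The idea is that you have to extract each blob/contour/character from the binary image, compute its centroid, and assign it a new label. Again, there are a lot of ways to do this. Here, I'm using the following approach:

I'll loop through the binary mask. I'll get the current biggest blob from this binary input. I'll compute its centroid and store its data in every container needed, and then I'll delete that blob from the mask. I'll repeat the process until no more blobs are left. This is my way of doing it, mainly because I already have functions written for that. This is the approach:

//Prepare a couple of dictionaries for data storing:
std::map< int, cv::Point > blobMap; //holds label, gives centroid
std::map< int, cv::Rect > boundingBoxMap; //holds label, gives bounding box

First, two dictionaries. One receives a blob label and returns the centroid. The other one receives the same label and returns the bounding box.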

//Extract each individual blob:
//(findBiggestBlob, computeBoundingBox and boundingBox2Rect are my own helpers, not shown here)
cv::Mat bobFilterInput = binImage.clone();

//The new blob label:
int blobLabel = 0;

//Some control variables:
bool extractBlobs = true; //Controls loop
int currentBlob = 0; //Counter of blobs

while ( extractBlobs ){

    //Get the biggest blob:
    cv::Mat biggestBlob = findBiggestBlob( bobFilterInput );

    //Compute the centroid/center of mass:
    cv::Moments momentStructure = cv::moments( biggestBlob, true );
    float cx = momentStructure.m10 / momentStructure.m00;
    float cy = momentStructure.m01 / momentStructure.m00;

    //Centroid point:
    cv::Point blobCentroid;
    blobCentroid.x = cx;
    blobCentroid.y = cy;

    //Compute bounding box:
    boundingBox boxData;
    computeBoundingBox( biggestBlob, boxData );

    //Convert boundingBox data into opencv rect data:
    cv::Rect cropBox = boundingBox2Rect( boxData );

    //Label blob:
    blobLabel++;
    blobMap.emplace( blobLabel, blobCentroid );
    boundingBoxMap.emplace( blobLabel, cropBox );

    //Get the row for this centroid
    int blobRow = rowMask.at<uchar>( cy, cx );
    blobRow--;

    //Place centroid on rowed image:
    rowTable.at<uchar>( blobRow, cx ) = blobLabel;

    //Resume blob flow control:
    cv::Mat blobDifference = bobFilterInput - biggestBlob;
    //How many pixels are left on the new mask?
    int pixelsLeft = cv::countNonZero( blobDifference );
    bobFilterInput = blobDifference;

    //Done extracting blobs?
    if ( pixelsLeft <= 0 ){
        extractBlobs = false;
    }

    //Increment blob counter:
    currentBlob++;

}

Check out a nice animation of how this processing goes through each blob, processes it, and deletes it until there's nothing left:

Now, a couple of notes on the above snippet. I have some helper functions: findBiggestBlob, computeBoundingBox, and boundingBox2Rect. These compute the biggest blob in a binary image, compute its bounding box as a custom structure, and convert that structure into OpenCV's Rect structure, respectively. Those are the operations these functions carry out.
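
Since the helpers are not shown, here is a rough Python sketch of the two per-blob measurements the loop relies on: the centroid as m10/m00 and m01/m00 over the non-zero pixels, and the bounding box as (x, y, w, h). These are hypothetical stand-ins on nested lists, not the author's helper functions:

```python
def centroid(mask):
    """Centroid (cx, cy) of the non-zero pixels, as m10/m00 and m01/m00."""
    m00 = m10 = m01 = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                m00 += 1   # area (pixel count)
                m10 += x   # sum of x coordinates
                m01 += y   # sum of y coordinates
    if m00 == 0:
        raise ValueError("mask contains no blob pixels")
    return m10 / m00, m01 / m00


def bounding_box(mask):
    """Bounding box (x, y, w, h) of the non-zero pixels."""
    xs = [x for row in mask for x, value in enumerate(row) if value]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        raise ValueError("mask contains no blob pixels")
    return min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1


# A single 3x2 rectangular blob (made-up data).
blob = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(centroid(blob), bounding_box(blob))  # (3.0, 1.5) (2, 1, 3, 2)
```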

The "meat" of the snippet is this: once you have an isolated blob, compute its centroid (I actually compute the center of mass via image moments). Generate a new label. Store this label and centroid in a dictionary, in my case the blobMap dictionary. Additionally, compute the bounding box and store it in another dictionary, boundingBoxMap:

//Label blob:
blobLabel++;
blobMap.emplace( blobLabel, blobCentroid );
boundingBoxMap.emplace( blobLabel, cropBox );

Now, using the centroid data, fetch the corresponding row of that blob. Once you get the row, store this number into your Row Table:

//Get the row for this centroid
int blobRow = rowMask.at<uchar>( cy, cx );
blobRow--;

//Place centroid on rowed image:
rowTable.at<uchar>( blobRow, cx ) = blobLabel;

Excellent. At this point you have the Row Table ready. Let's loop through it and actually, finally, order those damn blobs:

int blobCounter = 1; //The ORDERED label, starting at 1
for( int y = 0; y < rowTable.rows; y++ ){
    for( int x = 0; x < rowTable.cols; x++ ){
        //Get current label:
        uchar currentLabel = rowTable.at<uchar>( y, x );
        //Is it a valid label?
        if ( currentLabel != 255 ){
            //Get the bounding box for this label:
            cv::Rect currentBoundingBox = boundingBoxMap[ currentLabel ];
            cv::rectangle( testImage, currentBoundingBox, cv::Scalar(0,255,0), 2, 8, 0 );
            //The blob counter to string:
            std::string counterString = std::to_string( blobCounter );
            cv::putText( testImage, counterString, cv::Point( currentBoundingBox.x, currentBoundingBox.y-1 ),
                         cv::FONT_HERSHEY_SIMPLEX, 0.7, cv::Scalar(255,0,0), 1, cv::LINE_8, false );
            blobCounter++; //Increment the blob/label
        }
    }
}

Nothing fancy, just a regular nested for loop, looping through each pixel of the Row Table. If the pixel is different from white, use the label to retrieve both the centroid and the bounding box, and just change the label to an increasing number. To display the result, I just draw the bounding boxes and the new label on the original image.

Check out the ordered processing in this animation:

Very cool. Here's a bonus animation: the Row Table getting populated with horizontal coordinates:
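
Since the question was asked in Python, the row-then-column idea can also be sketched directly on bounding boxes. Note that this is not the answer's method: it swaps the morphological Row Mask for a simpler interval merge, grouping boxes whose vertical extents overlap into one row. `sort_boxes_reading_order` and the sample boxes are hypothetical, and the boxes are assumed to be `(x, y, w, h)` tuples such as those returned by `cv2.boundingRect`.

```python
def sort_boxes_reading_order(boxes):
    """Sort (x, y, w, h) boxes top-to-bottom by row, then left-to-right.

    Rows are formed by merging boxes whose vertical extents overlap.
    """
    if not boxes:
        return []
    # Visit boxes by top edge so overlapping vertical spans are adjacent.
    by_top = sorted(boxes, key=lambda b: b[1])
    rows = [[by_top[0]]]
    row_bottom = by_top[0][1] + by_top[0][3]
    for box in by_top[1:]:
        x, y, w, h = box
        if y < row_bottom:            # overlaps the current row's span
            rows[-1].append(box)
            row_bottom = max(row_bottom, y + h)
        else:                         # starts below it: open a new row
            rows.append([box])
            row_bottom = y + h
    ordered = []
    for row in rows:
        # Horizontal order by the x coordinate of each box centre.
        ordered.extend(sorted(row, key=lambda b: b[0] + b[2] / 2))
    return ordered


# Made-up boxes: two slightly misaligned text lines.
boxes = [(300, 58, 20, 30), (50, 62, 20, 30), (40, 150, 20, 30), (200, 145, 25, 35)]
ordered = sort_boxes_reading_order(boxes)
print(ordered)
```

This needs no tolerance factor, but it inherits a weakness of any row-based grouping: a box tall enough to overlap two text lines would merge them into a single row.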

Archived: 5 years, 6 months ago
Views: 2,990
Last activity: 3 years, 6 months ago