Isolating individual digits from an image for OCR

Question

Sid*_*hay 5 ocr image-processing image-segmentation

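Recognising a single digit character is fairly simple, but that only works when the image contains exactly one digit.

When the image contains several digits we cannot use the same algorithm, because the bitmap as a whole is different. How do we process the image to segment it, so that we can "modularise" the OCR operation and run it on each individual digit?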

Answer 1

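What you want to solve is an image segmentation problem, not a digit classification problem, as @VitaliPro said. Both are OCR problems but, greatly simplified, the first asks "which character is this?" and the second asks "how many characters do I have here?". You already know how to solve the first one, so let's look at how the second is usually solved.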

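You want to split the image into characters (called "regions" in segmentation) and then apply digit classification to each region. One way to do this is watershed segmentation, which uses colour gradients to tell edges apart from regions.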
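To make the split-then-classify idea concrete before turning to watershed, here is a minimal sketch that uses a much cruder segmentation: it cuts the image wherever a column contains no ink. It assumes a boolean image (True = ink) whose digits are separated by at least one empty column. `split_columns`, `recognise` and the `classify` callable are hypothetical names for illustration; `classify` stands in for whatever single-digit recogniser you already have.

```python
import numpy as np

def split_columns(binary):
    """Return (start, end) column ranges, one per run of columns containing ink.

    Assumption: characters are separated by at least one ink-free column.
    """
    has_ink = np.asarray(binary, dtype=bool).any(axis=0)
    # Pad with False so every run of ink columns has both a start and an end.
    padded = np.concatenate(([False], has_ink, [False]))
    change = np.flatnonzero(padded[1:] != padded[:-1])
    # `change` alternates between run starts and (exclusive) run ends.
    return [(int(s), int(e)) for s, e in zip(change[0::2], change[1::2])]

def recognise(binary, classify):
    """Apply `classify` (hypothetical single-digit classifier) to each region."""
    binary = np.asarray(binary, dtype=bool)
    return [classify(binary[:, s:e]) for s, e in split_columns(binary)]

if __name__ == "__main__":
    demo = np.zeros((5, 9), dtype=bool)
    demo[1:4, 1:3] = True    # first "digit"
    demo[0:5, 5:6] = True    # second "digit"
    # Stand-in classifier: just reports how many ink pixels each region has.
    print(recognise(demo, lambda region: int(region.sum())))
```

This breaks down as soon as digits touch or overlap horizontally, which is one reason to reach for a gradient-based method such as watershed.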

A simple watershed can be done with Python's numpy/scipy/skimage, for example:

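#!/usr/bin/env python

from PIL import Image
import numpy as np
from scipy import ndimage
from skimage import morphology as morph
from skimage.filters import rank            # was skimage.filter in old releases
from skimage.segmentation import watershed  # was morph.watershed in old releases

def big_regions(lb, tot):
    # (pixel count, label) pairs, largest region first
    l = []
    for i in range(1, tot+1):
        l.append(((i == lb).sum(), i))
    l.sort()
    l.reverse()
    return l

def segment(img, outimg):
    # outimg is unused in this minimal version
    # rank filters need a 2-D uint8 image, so load as 8-bit greyscale
    img = np.array(Image.open(img).convert('L'))
    # denoise (slight blur) so tiny artefacts do not become regions
    den = rank.median(img, morph.disk(3))
    # continuous regions (low gradient)
    markers  = rank.gradient(den, morph.disk(5)) < 10
    mrk, tot = ndimage.label(markers)
    grad     = rank.gradient(den, morph.disk(2))
    labels   = watershed(grad, mrk)
    print('Total regions:', tot)
    regs = big_regions(labels, tot)
    return labels, regs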


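Here I use the watershed segmentation from skimage. It was originally exposed through the morphology module (imported as morph); current releases provide it as skimage.segmentation.watershed.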

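Most of the time, with watershed, you should lay the region over the image to get the actual content of that region, which I did not do in the code above. For digits, or most text, this is not necessary, since the image should be black and white anyway.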
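For clean black-and-white digits that do not touch each other, a simpler alternative to watershed is plain connected-component labelling. The following is only a sketch of that swapped-in technique, not the method used above. It assumes dark ink on a light background, and `isolate_digits`, `threshold` and `min_pixels` are hypothetical names and illustrative values, not tuned ones.

```python
import numpy as np
from scipy import ndimage

def isolate_digits(gray, threshold=128, min_pixels=4):
    """Return one cropped boolean array per connected dark blob, left to right.

    Assumptions: dark ink on a light background, digits do not touch.
    """
    ink = np.asarray(gray) < threshold
    labels, count = ndimage.label(ink)           # default: 4-connected components
    found = []
    for index, box in enumerate(ndimage.find_objects(labels), start=1):
        blob = labels[box] == index              # keep only this component
        if blob.sum() >= min_pixels:             # ignore tiny specks
            found.append((box[1].start, blob))   # remember the left edge
    found.sort(key=lambda item: item[0])         # order by left edge
    return [blob for _, blob in found]

if __name__ == "__main__":
    page = np.full((10, 20), 255, dtype=np.uint8)
    page[2:8, 12:15] = 0     # a blob on the right
    page[3:7, 2:5] = 0       # a blob on the left
    page[0, 9] = 0           # a one-pixel speck, filtered out
    for digit in isolate_digits(page):
        print(digit.shape)
```

Each returned crop can then be passed to the single-digit classifier. Digits made of several disconnected strokes, or digits that touch, would need extra handling that this sketch does not attempt.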

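Watershed uses the colour gradient to identify edges, but filters such as Canny or Sobel can be used as well. Note that I denoise the image (a slight blur) to avoid finding very small regions, since those are most likely artefacts or noise. Using a Canny or Sobel filter may require more denoising steps, because those filters produce sharp edges.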
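As a rough illustration of the denoise-then-filter idea, the sketch below uses scipy.ndimage as a stand-in for the rank filters in the code above: a median filter for denoising, followed by a Sobel gradient magnitude. The function name `sobel_gradient` and the `median_size` default are assumptions made for this example. The returned gradient could be tried as the elevation image for watershed, although the marker threshold would need re-tuning.

```python
import numpy as np
from scipy import ndimage

def sobel_gradient(gray, median_size=3):
    """Median-filter `gray`, then return (denoised image, Sobel gradient magnitude)."""
    den = ndimage.median_filter(np.asarray(gray, dtype=float), size=median_size)
    gx = ndimage.sobel(den, axis=1)    # derivative along columns
    gy = ndimage.sobel(den, axis=0)    # derivative along rows
    return den, np.hypot(gx, gy)

if __name__ == "__main__":
    img = np.zeros((20, 20))
    img[5:15, 5:15] = 255              # one real region
    img[1, 1] = 255                    # one-pixel noise speck
    den, grad = sobel_gradient(img)
    # Count connected bright regions before and after denoising.
    print(ndimage.label(img > 0)[1], ndimage.label(den > 0)[1])
```

The median filter removes the isolated speck before the gradient is taken, so it never shows up as a sharp edge, and therefore never becomes a region of its own.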

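Segmentation is useful for much more than splitting characters; it is commonly used on images to pick out the significant regions (i.e. large areas that look very similar). For example, if I add some matplotlib on top of the code above and change the segment function to, say: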

import matplotlib
matplotlib.use('Agg')   # non-interactive backend: the figure is written to a file
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from skimage.filters import rank
from skimage.segmentation import watershed

# note: cm.spectral was renamed to cm.nipy_spectral in newer matplotlib

def plot_seg(spr, spc, sps, img, cmap, alpha, xlabel):
    # draw one image in subplot number sps of a spr x spc grid
    plt.subplot(spr, spc, sps)
    plt.imshow(img, cmap=cmap, interpolation='nearest', alpha=alpha)
    plt.yticks([])
    plt.xticks([])
    plt.xlabel(xlabel)

def plot_mask(spr, spc, sps, reg, lb, regs, cmap, xlabel):
    # show only the reg-th largest region of the label image
    masked = np.ma.masked_array(lb, ~(lb == regs[reg][1]))
    plot_seg(spr, spc, sps, masked, cmap, 1, xlabel)

def plot_crop(spr, spc, sps, reg, img, lb, regs, cmap):
    # use the region as a mask on the image, then drop the rows and
    # columns that are entirely masked or zero
    masked = np.ma.masked_array(img, ~(lb == regs[reg][1]))
    crop   = masked[~np.all(masked == 0, axis=1), :]
    crop   = crop[:, ~np.all(crop == 0, axis=0)]
    plot_seg(spr, spc, sps, crop, cmap, 1, '%i px' % regs[reg][0])

def segment(img, outimg):
    # rank filters need a 2-D uint8 image, so load as 8-bit greyscale
    img = np.array(Image.open(img).convert('L'))
    den = rank.median(img, morph.disk(3))
    # continuous regions (low gradient)
    markers  = rank.gradient(den, morph.disk(5)) < 10
    mrk, tot = ndimage.label(markers)
    grad     = rank.gradient(den, morph.disk(2))
    labels   = watershed(grad, mrk)
    print('Total regions:', tot)
    regs = big_regions(labels, tot)

    spr = 3
    spc = 6
    plot_seg(spr, spc, 1, img,    cm.gray,          1,   'image')
    plot_seg(spr, spc, 2, den,    cm.gray,          1,   'denoised')
    plot_seg(spr, spc, 3, grad,   cm.nipy_spectral, 1,   'gradient')
    plot_seg(spr, spc, 4, mrk,    cm.nipy_spectral, 1,   'markers')
    plot_seg(spr, spc, 5, labels, cm.nipy_spectral, 1,   'regions\n%i' % tot)
    plot_seg(spr, spc, 6, img,    cm.gray,          1,   'composite')
    plot_seg(spr, spc, 6, labels, cm.nipy_spectral, 0.7, 'composite')

    # the plots below assume at least six regions were found
    plot_mask(spr, spc, 7,  0, labels, regs, cm.nipy_spectral, 'main region')
    plot_mask(spr, spc, 8,  1, labels, regs, cm.nipy_spectral, '2nd region')
    plot_mask(spr, spc, 9,  2, labels, regs, cm.nipy_spectral, '3rd region')
    plot_mask(spr, spc, 10, 3, labels, regs, cm.nipy_spectral, '4th region')
    plot_mask(spr, spc, 11, 4, labels, regs, cm.nipy_spectral, '5th region')
    plot_mask(spr, spc, 12, 5, labels, regs, cm.nipy_spectral, '6th region')

    plot_crop(spr, spc, 13, 0, img, labels, regs, cm.gray)
    plot_crop(spr, spc, 14, 1, img, labels, regs, cm.gray)
    plot_crop(spr, spc, 15, 2, img, labels, regs, cm.gray)
    plot_crop(spr, spc, 16, 3, img, labels, regs, cm.gray)
    plot_crop(spr, spc, 17, 4, img, labels, regs, cm.gray)
    plot_crop(spr, spc, 18, 5, img, labels, regs, cm.gray)

    # with the Agg backend plt.show() displays nothing, so save to outimg instead
    plt.savefig(outimg)


(This example does not run on its own; you need to put the other code sample above on top of it.)

I can get a pretty good segmentation of just about any image; for example, the result of the code above:

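The first row shows the steps of the segment function, the second row shows the regions, and the third row shows the regions used as masks on top of the image.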
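The "region used as a mask on top of the image" step can be sketched in isolation as follows. This is a simplified variant written for illustration, not the plotting code above: it crops to the bounding box of one label and masks out pixels in that box that belong to other regions. The name `crop_region` is hypothetical.

```python
import numpy as np

def crop_region(img, labels, region_id):
    """Return the part of `img` covered by `region_id`, cropped to its bounding box.

    Pixels inside the box that belong to other regions are masked out.
    """
    mask = labels == region_id
    if not mask.any():
        raise ValueError("region %r not present in labels" % (region_id,))
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    box = (slice(rows[0], rows[-1] + 1), slice(cols[0], cols[-1] + 1))
    return np.ma.masked_array(img[box], mask=~mask[box])

if __name__ == "__main__":
    img = np.arange(36).reshape(6, 6)
    labels = np.zeros((6, 6), dtype=int)
    labels[1:3, 2:5] = 1
    labels[2, 3] = 2         # a pixel of another region inside the box
    print(crop_region(img, labels, 1))
```

For digit recognition, a crop like this (one per region) is what would be handed to the single-digit classifier.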

(P.S. Yes, the plotting code is rather ugly, but it is easy to understand and change.)

Archived: 8 years, 6 months ago
Views: 270
Last activity: 8 years, 5 months ago