Detect image orientation angle based on text direction

Question

Rav*_*avi 9 python opencv image image-processing computer-vision

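I am working on an OCR task to extract information from multiple identity documents. One challenge is the orientation of the scanned image. I need to determine the orientation of a scanned image of a PAN card, Aadhaar card, driving license, or any other ID proof.

I have already tried all the approaches suggested on Stack Overflow and other forums, such as OpenCV minAreaRect, Hough line transforms, FFT, homography, and Tesseract OSD with psm 0. None of them worked.

The logic should return the angle of the text direction: 0, 90, or 270 degrees. Images at 0, 90, and 270 degrees are attached. This is not about determining skew.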

Answer 1

nat*_*ncy 13

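Here is an approach based on the assumption that the majority of the text is skewed toward one side. The idea is that we can determine the angle from where the major text region is located.

Convert the image to grayscale and apply a Gaussian blur
Adaptive threshold to obtain a binary image
Find contours and filter using contour area
Draw the filtered contours onto a mask
Split the image horizontally or vertically based on orientation
Count the number of pixels in each half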

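After converting to grayscale and applying a Gaussian blur, we adaptive threshold to obtain a binary image.

From here we find contours and filter by contour area to remove small noise particles and the large border. Any contour that passes this filter is drawn onto a mask.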

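To determine the angle, we split the image in half based on its dimensions. If width > height, it must be a horizontal image, so we split it vertically into left and right halves. If height > width, it must be a vertical image, so we split it horizontally into top and bottom halves.

Now that we have two halves, we can use cv2.countNonZero() to determine the number of white pixels in each half. Here is the logic to determine the angle: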
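The split-and-count step can be illustrated without OpenCV. This is a dependency-free sketch in which plain nested lists stand in for the mask and a small helper stands in for `cv2.countNonZero`; the names `count_nonzero` and `split_counts` and the tiny masks are hypothetical.

```python
def count_nonzero(rows):
    """Count non-zero entries in a list of pixel rows."""
    return sum(1 for row in rows for value in row if value != 0)

def split_counts(mask):
    """Split a 2-D mask across its longer dimension and count white pixels per half."""
    height, width = len(mask), len(mask[0])
    if width > height:
        # Horizontal image: compare the left and right halves
        first = [row[:width // 2] for row in mask]
        second = [row[width // 2:] for row in mask]
        return "horizontal", count_nonzero(first), count_nonzero(second)
    # Vertical image: compare the top and bottom halves
    first = mask[:height // 2]
    second = mask[height // 2:]
    return "vertical", count_nonzero(first), count_nonzero(second)

wide_mask = [
    [255, 255, 0, 0],
    [255, 0, 0, 255],
]
tall_mask = [
    [0, 0],
    [255, 0],
    [255, 255],
    [255, 255],
]
print(split_counts(wide_mask))  # ('horizontal', 3, 1)
print(split_counts(tall_mask))  # ('vertical', 1, 4)
```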

if horizontal
    if left >= right 
        degree -> 0
    else 
        degree -> 180
if vertical
    if bottom >= top
        degree -> 90
    else
        degree -> 270

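Left 9703

Right 3975

Therefore the image is at 0 degrees. Here are the results for the other orientations.

Left 3975

Right 9703

We can conclude that the image is flipped 180 degrees.

Here are the results for a vertical image. Note that since it is a vertical image, we split horizontally.

Top 3947

Bottom 9550

Therefore the result is 90 degrees.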
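The three results above can be replayed through the decision rule on its own. This is a small sketch, assuming a hypothetical helper `angle_from_counts` that receives the pixel counts directly; ties resolve the same way as in the `detect_angle` code in this answer.

```python
def angle_from_counts(orientation, first_half, second_half):
    """Map half-image pixel counts to an angle.

    first_half is the left (horizontal) or top (vertical) count;
    second_half is the right (horizontal) or bottom (vertical) count.
    """
    if orientation == "horizontal":
        return 0 if first_half >= second_half else 180
    return 90 if second_half >= first_half else 270

print(angle_from_counts("horizontal", 9703, 3975))  # 0
print(angle_from_counts("horizontal", 3975, 9703))  # 180
print(angle_from_counts("vertical", 3947, 9550))    # 90
```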

import cv2
import numpy as np

def detect_angle(image):
    # Expects a 3-channel BGR image, as returned by cv2.imread
    mask = np.zeros(image.shape, dtype=np.uint8)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blur = cv2.GaussianBlur(gray, (3,3), 0)
    # Inverted adaptive threshold: text becomes white on a black background
    adaptive = cv2.adaptiveThreshold(blur,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV,15,4)

    # findContours returns 2 values in OpenCV 2.x/4.x and 3 values in 3.x
    cnts = cv2.findContours(adaptive, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]

    # Keep contours within the area bounds, dropping small noise and large borders
    for c in cnts:
        area = cv2.contourArea(c)
        if 20 < area < 45000:
            cv2.drawContours(mask, [c], -1, (255,255,255), -1)

    mask = cv2.cvtColor(mask, cv2.COLOR_BGR2GRAY)
    h, w = mask.shape

    # Horizontal image: compare the left and right halves
    if w > h:
        left = mask[0:h, 0:w//2]
        right = mask[0:h, w//2:]
        left_pixels = cv2.countNonZero(left)
        right_pixels = cv2.countNonZero(right)
        return 0 if left_pixels >= right_pixels else 180
    # Vertical image: compare the top and bottom halves
    else:
        top = mask[0:h//2, 0:w]
        bottom = mask[h//2:, 0:w]
        top_pixels = cv2.countNonZero(top)
        bottom_pixels = cv2.countNonZero(bottom)
        return 90 if bottom_pixels >= top_pixels else 270

if __name__ == '__main__':
    image = cv2.imread('1.png')
    angle = detect_angle(image)
    print(angle)

Archived: 6 years, 5 months ago
Views: 4,293
Last activity: 4 years, 6 months ago