sin*_*ium 35 python opencv image-rotation skew
我有几百张图像(扫描的文档),其中大多数是歪斜的。我想使用Python使它们偏斜。
这是我使用的代码:
import numpy as np
import cv2
from skimage.transform import radon
filename = 'path_to_filename'
# Load file, converting to grayscale
img = cv2.imread(filename)
I = cv2.cvtColor(img, COLOR_BGR2GRAY)
h, w = I.shape
# If the resolution is high, resize the image to reduce processing time.
if (w > 640):
I = cv2.resize(I, (640, int((h / w) * 640)))
I = I - np.mean(I) # Demean; make the brightness extend above and below zero
# Do the radon transform
sinogram = radon(I)
# Find the RMS value of each row and find "busiest" rotation,
# where the transform is lined up perfectly with the alternating dark
# text and white lines
r = np.array([np.sqrt(np.mean(np.abs(line) ** 2)) for line in sinogram.transpose()])
rotation = np.argmax(r)
print('Rotation: {:.2f} degrees'.format(90 - rotation))
# Rotate and save with the original resolution
M = cv2.getRotationMatrix2D((w/2,h/2),90 - rotation,1)
dst = cv2.warpAffine(img,M,(w,h))
cv2.imwrite('rotated.jpg', dst)
Run Code Online (Sandbox Code Playgroud)
该代码对大多数文档都适用,除了某些角度:(180和0)和(90和270)通常被检测为相同角度(即,它不会使(180和0)与(90和90之间产生差异)。 270))。所以我得到了很多颠倒的文件。
我得到的结果图像与输入图像相同。
是否有建议使用Opencv和Python检测图像是否颠倒?
PS:我尝试使用EXIF数据检查方向,但没有找到任何解决方案。
编辑:
可以使用Tesseract(Python的pytesseract)检测方向,但是仅当图像包含很多字符时才有可能。
对于可能需要此功能的任何人:
import cv2
import pytesseract
print(pytesseract.image_to_osd(cv2.imread(file_name)))
Run Code Online (Sandbox Code Playgroud)
如果文档包含足够的字符,则Tesseract可以检测方向。但是,当图像的线条很少时,Tesseract建议的定向角度通常是错误的。因此,这不是100%的解决方案。
Ste*_*hke 21
Python3 / OpenCV4脚本可对齐扫描的文档。
旋转文档并汇总行。当文档旋转0度和180度时,图像中会出现很多黑色像素:
使用分数保持方法。对每个图像进行评分,以使其类似于斑马纹。得分最高的图像具有正确的旋转度。您链接的图像偏离了0.5度。为了便于阅读,我省略了一些功能,完整的代码可以在此处找到。
# Rotate the image around in a circle
angle = 0
while angle <= 360:
# Rotate the source image
img = rotate(src, angle)
# Crop the center 1/3rd of the image (roi is filled with text)
h,w = img.shape
buffer = min(h, w) - int(min(h,w)/1.15)
roi = img[int(h/2-buffer):int(h/2+buffer), int(w/2-buffer):int(w/2+buffer)]
# Create background to draw transform on
bg = np.zeros((buffer*2, buffer*2), np.uint8)
# Compute the sums of the rows
row_sums = sum_rows(roi)
# High score --> Zebra stripes
score = np.count_nonzero(row_sums)
scores.append(score)
# Image has best rotation
if score <= min(scores):
# Save the rotatied image
print('found optimal rotation')
best_rotation = img.copy()
k = display_data(roi, row_sums, buffer)
if k == 27: break
# Increment angle and try again
angle += .75
cv2.destroyAllWindows()
Run Code Online (Sandbox Code Playgroud)
如何判断文件是否颠倒?填写从文档顶部到图像中第一个非黑色像素的区域。用黄色测量面积。面积最小的图像将是正面朝上的图像:
# Find the area from the top of page to top of image
_, bg = area_to_top_of_text(best_rotation.copy())
right_side_up = sum(sum(bg))
# Flip image and try again
best_rotation_flipped = rotate(best_rotation, 180)
_, bg = area_to_top_of_text(best_rotation_flipped.copy())
upside_down = sum(sum(bg))
# Check which area is larger
if right_side_up < upside_down: aligned_image = best_rotation
else: aligned_image = best_rotation_flipped
# Save aligned image
cv2.imwrite('/home/stephen/Desktop/best_rotation.png', 255-aligned_image)
cv2.destroyAllWindows()
Run Code Online (Sandbox Code Playgroud)
Assuming you did run the angle-correction already on the image, you can try the following to find out if it is flipped:
The peak finding in step 3 is done by finding sections with above average values. The sub-peaks are then found via argmax.
Here's a figure to illustrate the approach; A few lines of you example image
here's some code that does this:
import cv2
import numpy as np
# load image, convert to grayscale, threshold it at 127 and invert.
page = cv2.imread('Page.jpg')
page = cv2.cvtColor(page, cv2.COLOR_BGR2GRAY)
page = cv2.threshold(page, 127, 255, cv2.THRESH_BINARY_INV)[1]
# project the page to the side and smooth it with a gaussian
projection = np.sum(page, 1)
gaussian_filter = np.exp(-(np.arange(-3, 3, 0.1)**2))
gaussian_filter /= np.sum(gaussian_filter)
smooth = np.convolve(projection, gaussian_filter)
# find the pixel values where we expect lines to start and end
mask = smooth > np.average(smooth)
edges = np.convolve(mask, [1, -1])
line_starts = np.where(edges == 1)[0]
line_endings = np.where(edges == -1)[0]
# count lines with peaks on the lower side
lower_peaks = 0
for start, end in zip(line_starts, line_endings):
line = smooth[start:end]
if np.argmax(line) < len(line)/2:
lower_peaks += 1
print(lower_peaks / len(line_starts))
Run Code Online (Sandbox Code Playgroud)
this prints 0.125 for the given image, so this is not oriented correctly and must be flipped.
Note that this approach might break badly if there are images or anything not organized in lines in the image (maybe math or pictures). Another problem would be too few lines, resulting in bad statistics.
Also different fonts might result in different distributions. You can try this on a few images and see if the approach works. I don't have enough data.
| 归档时间: |
|
| 查看次数: |
2542 次 |
| 最近记录: |