One of the biggest challenges for Tesseract OCR text recognition is uneven illumination in the input image. I need an algorithm that can decide whether an image contains uneven illumination.

I have attached sample images: a no-illumination image, a glare image (white-spotted image), and a shadow-containing image. Given an image, the algorithm should assign it to one of two classes:

Class A - no-illumination (evenly lit) images
Class B - unevenly illuminated images: glare images (white-spotted images) and shadow-containing images
My initial approach:

1. Change the color space to HSV.
2. Analyze the histogram of the HSV value channel to identify uneven illumination.
3. Alternatively, instead of the first two steps, use the perceived brightness channel in place of the HSV value channel.
4. Set a low threshold and count the pixels whose value falls below it.
5. Set a high threshold and count the pixels whose value falls above it.
6. Use the percentage of low-value pixels and the percentage of high-value pixels to detect uneven lighting (the percentages themselves also need thresholds).
However, I could not find a strong common pattern across the unevenly illuminated images. Histogram analysis only showed me that some pixels have low values and some pixels have high values.

Basically, my idea is this: set a low threshold and count how many pixels fall below it, and set a high threshold and count how many pixels fall above it. With those pixel counts, can we reliably conclude that an image has uneven lighting? To draw that conclusion we still need to settle on the two thresholds and on the percentage of pixels.
import cv2
import numpy as np
from matplotlib import pyplot as plt

def show_hist_v(img_path):
    img = cv2.imread(img_path)
    hsv_img = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    h, s, v = cv2.split(hsv_img)
    histr = cv2.calcHist([v], [0], None, [256], [0, 256])
    plt.plot(histr)
    plt.show()
    low_count = np.count_nonzero(v < 50)    # pixels darker than the low threshold
    high_count = np.count_nonzero(v > 200)  # pixels brighter than the high threshold
    total_pixels = img.shape[0] * img.shape[1]
    percent_low = low_count / total_pixels * 100
    percent_high = high_count / total_pixels * 100
    print("Total pixels - {}\n"
          "Pixels more than 200 - {}\n"
          "Pixels less than 50 - {}\n"
          "Pixels percentage more than 200 - {}\n"
          "Pixels percentage less than 50 - {}\n".format(
              total_pixels, high_count, low_count, percent_high, percent_low))
    return total_pixels, high_count, low_count, percent_low, percent_high
So, can anyone improve my initial approach, or suggest a better method than this one to detect uneven illumination in images in the general case?

I also tried perceived brightness instead of the value channel, since the value channel simply takes the maximum of the (b, g, r) values; I thought perceived brightness would be a better choice:
def get_perceive_brightness(float_img):
    float_img = np.float64(float_img)  # uint8 would overflow when squaring
    b, g, r = cv2.split(float_img)
    float_brightness = np.sqrt(
        (0.241 * (r ** 2)) + (0.691 * (g ** 2)) + (0.068 * (b ** 2)))
    brightness_channel = np.uint8(np.absolute(float_brightness))
    return brightness_channel
def show_hist_v(img_path):
    img = cv2.imread(img_path)
    v = get_perceive_brightness(img)
    histr = cv2.calcHist([v], [0], None, [256], [0, 256])
    plt.plot(histr)
    plt.show()
    low_count = np.count_nonzero(v < 50)
    high_count = np.count_nonzero(v > 200)
    total_pixels = img.shape[0] * img.shape[1]
    percent_low = low_count / total_pixels * 100
    percent_high = high_count / total_pixels * 100
    print("Total pixels - {}\n"
          "Pixels more than 200 - {}\n"
          "Pixels less than 50 - {}\n"
          "Pixels percentage more than 200 - {}\n"
          "Pixels percentage less than 50 - {}\n".format(
              total_pixels, high_count, low_count, percent_high, percent_low))
    return total_pixels, high_count, low_count, percent_low, percent_high
Histogram analysis of the perceived brightness channel

As Ahmet suggested:
def get_percentage_of_binary_pixels(img=None, img_path=None):
    if img is None:
        if img_path is not None:
            gray_img = cv2.imread(img_path, 0)
        else:
            return "No img or img_path"
    else:
        print(img.shape)
        if len(img.shape) > 2:
            gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        else:
            gray_img = img
    h, w = gray_img.shape
    gaussian_blur = cv2.GaussianBlur(gray_img, (5, 5), 0)
    thresh_value, otsu_img = cv2.threshold(gaussian_blur, 0, 255,
                                           cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    cv2.imwrite("binary/{}".format(img_path.split('/')[-1]), otsu_img)
    black_pixels = np.count_nonzero(otsu_img == 0)
    # white_pixels = np.count_nonzero(otsu_img == 255)
    black_pixels_percentage = black_pixels / (h * w) * 100
    # white_pixels_percentage = white_pixels / (h * w) * 100
    return black_pixels_percentage
When the black-pixel percentage obtained from Otsu binarization is above 35%, I can detect around 80% of the unevenly illuminated images. Detection fails when the uneven lighting affects only a small region of the image.
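For reference, a minimal sketch of how this empirical rule could be turned into the Class A / Class B decision, reusing get_percentage_of_binary_pixels from above (classify_by_otsu_black_ratio is a hypothetical helper name, and the 35% cut-off is just the value observed above, not a tuned constant):

def classify_by_otsu_black_ratio(img_path, black_percent_threshold=35.0):
    # Class B (uneven illumination) when Otsu binarization leaves more
    # than ~35% black pixels, Class A (evenly lit) otherwise.
    black_percentage = get_percentage_of_binary_pixels(img_path=img_path)
    return "B" if black_percentage > black_percent_threshold else "A"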
Thanks in advance.
Why not remove the lighting effect from the images altogether?

For example:

If we try to read the image as-is with pytesseract, the output will be ' \n\f' (essentially empty).
import cv2
import pytesseract

img = cv2.imread('img2.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
smooth = cv2.GaussianBlur(gray, (95, 95), 0)
division = cv2.divide(gray, smooth, scale=192)

After this preprocessing, part of the pytesseract output will be:

...
Dosage & use
See package insert for compicic
information,

Instruction:
Keep all medicines out of the re.
Read the instructions carefully

Storage:
Store at temperature below 30°C.
Protect from Heat, light & moisture. BATCH NO. : 014C003
MFG. DATE - 03-2019

—— EXP. DATE : 03-2021

GENIX Distributed
AS Exclusi i :
genx PHARMA PRIVATE LIMITED Cevoka Pv 2 A '<
» 45-B, Kore ci
Karachi-75190, | Pakisier al Pei yaa fans
www.genixpharma.com

Similarly, part of the pytesseract output for the other sample will be:

...
Dosage & use
See package insert for complete prescribing
information. Rx Only

Instruction:
Keep all medicines out of the reach of children.
Read the instructions carefully before using.

Storage:

Store at temperature below 30°C. 5

Protect from Neat, light & moisture. BATCH NO, : 0140003
MFG. DATE : 03-2019
EXP. DATE : 03-2021

Manufactured by:

GENI N Exclusively Distributed by:
GENIX PHARMA PRIVATE LIMITED Ceyoka (Pvt) Ltd.

44, 45-B, Korangi Creek Road, 55, Negombe Road,
Karachi-75190, Pakistan. Peliyagoda, Snianka,

www. genixpharma.com

Update
You can find the illuminated parts using the erode and dilate methods.

Result:

Code:
import cv2
import imutils
import numpy as np
from skimage import measure
from imutils import contours

img = cv2.imread('img2.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (95, 95), 0)
thresh = cv2.threshold(blurred, 200, 255, cv2.THRESH_BINARY)[1]
thresh = cv2.erode(thresh, None, iterations=2)
thresh = cv2.dilate(thresh, None, iterations=4)

# label connected bright regions (connectivity=2 is 8-connectivity in 2D)
labels = measure.label(thresh, connectivity=2, background=0)
mask = np.zeros(thresh.shape, dtype="uint8")
for label in np.unique(labels):
    if label == 0:
        continue
    labelMask = np.zeros(thresh.shape, dtype="uint8")
    labelMask[labels == label] = 255
    numPixels = cv2.countNonZero(labelMask)
    if numPixels > 300:
        mask = cv2.add(mask, labelMask)

cnts = cv2.findContours(mask.copy(), cv2.RETR_EXTERNAL,
                        cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
cnts = contours.sort_contours(cnts)[0]
for (i, c) in enumerate(cnts):
    (x, y, w, h) = cv2.boundingRect(c)
    ((cX, cY), radius) = cv2.minEnclosingCircle(c)
    cv2.circle(img, (int(cX), int(cY)), int(radius),
               (0, 0, 255), 3)
    cv2.putText(img, "#{}".format(i + 1), (x, y - 15),
                cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 0, 255), 2)
cv2.imshow("Image", img)
cv2.waitKey(0)

Note that I only tested this with the second image; you may need to adjust the parameters for the other images.
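If the goal is only the Class A / Class B decision rather than drawing circles, the same erode/dilate idea can be reduced to a yes/no check. This is just a sketch with a hypothetical has_glare_region helper; the 200 brightness threshold and 300-pixel blob size are the values from the snippet above and would need tuning per image set:

import cv2
import numpy as np
from skimage import measure

def has_glare_region(gray, bright_thresh=200, min_blob_pixels=300):
    # Blur, threshold the bright areas, and clean the mask with
    # erode/dilate, as in the snippet above.
    blurred = cv2.GaussianBlur(gray, (95, 95), 0)
    thresh = cv2.threshold(blurred, bright_thresh, 255, cv2.THRESH_BINARY)[1]
    thresh = cv2.erode(thresh, None, iterations=2)
    thresh = cv2.dilate(thresh, None, iterations=4)
    # Keep only connected bright blobs that are large enough.
    labels = measure.label(thresh, connectivity=2, background=0)
    for label in np.unique(labels):
        if label == 0:
            continue
        if np.count_nonzero(labels == label) > min_blob_pixels:
            return True  # a sizeable illuminated region exists -> Class B candidate
    return False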
I suggest using the division trick to separate the text from the background, and then computing statistics on the background only. After setting some reasonable thresholds it is easy to create a classifier for the illumination:
import cv2
from matplotlib import pyplot as plt

def get_image_stats(img_path, lbl):
    img = cv2.imread(img_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (25, 25), 0)
    no_text = gray * ((gray / blurred) > 0.99)                      # select background only
    no_text[no_text < 10] = no_text[no_text > 20].mean()            # convert black pixels to mean value
    no_bright = no_text.copy()
    no_bright[no_bright > 220] = no_bright[no_bright < 220].mean()  # disregard bright pixels
    print(lbl)
    std = no_bright.std()
    print('STD:', std)
    bright = (no_text > 220).sum()
    print('Bright pixels:', bright)
    plt.figure()
    plt.hist(no_text.reshape(-1, 1), 25)
    plt.title(lbl)
    if std > 25:
        print("!!! Detected uneven illumination")
    if no_text.mean() < 200 and bright > 8000:
        print("!!! Detected glare")
This results in:
good_img
STD: 11.264569863071165
Bright pixels: 58

glare_img
STD: 15.00149131296984
Bright pixels: 15122
!!! Detected glare

uneven_img
STD: 57.99510339944441
Bright pixels: 688
!!! Detected uneven illumination
Now let's analyze the histograms and apply some common sense. We expect the background to be uniform and to have low variance, which is exactly the case in "good_img". If the variance is high, its standard deviation is high, and it is a case of uneven brightness. In the image below you can see 3 (smaller) peaks that correspond to 3 differently illuminated areas; the large peak in the middle is the result of setting all the black pixels to the mean value. I believe it is safe to call images with an STD above 25 an "uneven illumination" case.
When there is glare, a large number of bright pixels is easy to spot (see the image on the right). Apart from the hot spot, a glare image looks like a good one. Setting the bright-pixel threshold to something like 8000 (1.5% of the total image size) should detect such images well. It is also possible that the background is very bright everywhere, so if the mean of the no_text pixels is above 200, we are in that case and there is no need to detect hot spots.
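Putting the two rules together, a minimal sketch of a classifier (classify_illumination is a hypothetical helper reusing the exact statistics from get_image_stats; the 25 / 200 / 8000 cut-offs are the empirical values above and would need tuning on your own data) could look like this:

import cv2

def classify_illumination(img_path):
    # Recompute the background statistics exactly as in get_image_stats,
    # then apply the STD / bright-pixel rules described above.
    img = cv2.imread(img_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (25, 25), 0)
    no_text = gray * ((gray / blurred) > 0.99)       # background only
    no_text[no_text < 10] = no_text[no_text > 20].mean()
    no_bright = no_text.copy()
    no_bright[no_bright > 220] = no_bright[no_bright < 220].mean()
    bright = (no_text > 220).sum()
    if no_bright.std() > 25:
        return "B"   # uneven illumination (shadows)
    if no_text.mean() < 200 and bright > 8000:
        return "B"   # glare / white spot
    return "A"       # evenly lit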