Abt*_*ian 26 python ocr tesseract image-processing python-tesseract
I am using python-tesseract to extract words from an image. It is a Python wrapper for Tesseract, which is an OCR engine.
I use the following code to get the words:
import tesseract
api = tesseract.TessBaseAPI()
api.Init(".","eng",tesseract.OEM_DEFAULT)
api.SetVariable("tessedit_char_whitelist", "0123456789abcdefghijklmnopqrstuvwxyz")
api.SetPageSegMode(tesseract.PSM_AUTO)
mImgFile = "test.jpg"
mBuffer=open(mImgFile,"rb").read()
result = tesseract.ProcessPagesBuffer(mBuffer,len(mBuffer),api)
print "result(ProcessPagesBuffer)=",result
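This returns only the words in the image, not their position, size, or orientation (in other words, the bounding boxes that contain them). Is there any way to get that information as well?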
stw*_*ykd 39
Use pytesseract.image_to_data():
import pytesseract
from pytesseract import Output
import cv2
img = cv2.imread('image.jpg')
d = pytesseract.image_to_data(img, output_type=Output.DICT)
n_boxes = len(d['level'])
for i in range(n_boxes):
    (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imshow('img', img)
cv2.waitKey(0)
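In the data returned by pytesseract.image_to_data():
left is the distance from the upper-left corner of the bounding box to the left border of the image. top is the distance from the upper-left corner of the bounding box to the top border of the image. width and height are the width and height of the bounding box. conf is the model's confidence in the word it predicted inside that bounding box. If conf is -1, the corresponding bounding box contains a block of text rather than a single word. The bounding boxes returned by pytesseract.image_to_boxes() enclose individual letters, so I believe pytesseract.image_to_data() is what you are looking for.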
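Since the rows with a conf of -1 describe blocks rather than words, it can help to filter them out before drawing. The following is a minimal sketch, not part of pytesseract itself: word_boxes, min_conf, and the sample dict are hypothetical names introduced here, and the dict is assumed to have the same keys as the output of image_to_data(img, output_type=Output.DICT).

```python
def word_boxes(data, min_conf=0.0):
    """Return (text, left, top, width, height) for each confidently recognized word.

    `data` is assumed to have the shape of the dict returned by
    pytesseract.image_to_data(img, output_type=Output.DICT).
    """
    words = []
    for i, text in enumerate(data['text']):
        # conf may be an int, a float, or a string depending on the pytesseract version
        conf = float(data['conf'][i])
        # Skip block-level rows (conf == -1), low-confidence rows, and empty text
        if conf < min_conf or not text.strip():
            continue
        words.append((text, data['left'][i], data['top'][i],
                      data['width'][i], data['height'][i]))
    return words


# Hand-written stand-in for image_to_data() output, so the sketch runs without Tesseract
sample = {
    'text': ['', 'Hello', 'wrld'],
    'conf': ['-1', 96, 30.5],
    'left': [0, 10, 50],
    'top': [0, 20, 20],
    'width': [100, 30, 40],
    'height': [100, 12, 12],
}
print(word_boxes(sample, min_conf=60))  # [('Hello', 10, 20, 30, 12)]
```

With a real image, the dict d from the snippet above could be passed in place of sample.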
len*_*310 15
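The tesseract.GetBoxText() method returns the exact position of each character in an array.
There is also a command-line option, tesseract test.jpg result hocr, which generates a result.html file containing the coordinates of each recognized word. I'm not sure whether it can be invoked from a Python script, though.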
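One way to drive that command from Python is the standard subprocess module, followed by reading the word coordinates out of the hOCR markup. This is only a sketch under assumptions: run_hocr and parse_hocr_words are hypothetical helpers, the regular expression expects word spans with the class ocrx_word followed by a title starting with bbox, and the output file extension (.html or .hocr) depends on the Tesseract version.

```python
import html
import re
import subprocess

# Matches word-level spans such as:
# <span class='ocrx_word' id='word_1_1' title='bbox 36 92 96 116; x_wconf 95'>Hello</span>
WORD_RE = re.compile(
    r"<span[^>]*class=['\"]ocrx_word['\"][^>]*"
    r"title=['\"]bbox (\d+) (\d+) (\d+) (\d+)[^'\"]*['\"][^>]*>(.*?)</span>",
    re.DOTALL,
)


def parse_hocr_words(hocr_text):
    """Return a list of (word, (x0, y0, x1, y1)) with a top-left image origin."""
    words = []
    for x0, y0, x1, y1, inner in WORD_RE.findall(hocr_text):
        # Drop nested formatting tags such as <strong> and decode HTML entities
        text = html.unescape(re.sub(r"<[^>]+>", "", inner)).strip()
        if text:
            words.append((text, (int(x0), int(y0), int(x1), int(y1))))
    return words


def run_hocr(image_path, output_base):
    """Run the command-line call from Python; needs the tesseract binary on PATH."""
    subprocess.run(["tesseract", image_path, output_base, "hocr"], check=True)


# Hand-written hOCR fragment, so the parser can be tried without running Tesseract
sample = (
    "<span class='ocrx_word' id='word_1_1' "
    "title='bbox 36 92 96 116; x_wconf 95'>Hello</span> "
    "<span class='ocrx_word' id='word_1_2' "
    "title='bbox 109 92 179 116; x_wconf 91'><strong>world</strong></span>"
)
print(parse_hocr_words(sample))
```

A regular expression keeps the sketch short; for anything beyond simple output, a real HTML parser would be the more robust choice.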
jtb*_*tbr 10
pytesseract can do this without writing to a file, using the image_to_boxes function:
import cv2
import pytesseract
filename = 'image.png'
# read the image and get the dimensions
img = cv2.imread(filename)
h, w, _ = img.shape # assumes color image
# run tesseract, returning the bounding boxes
boxes = pytesseract.image_to_boxes(img) # also include any config options you use
# draw the bounding boxes on the image
for b in boxes.splitlines():
    b = b.split(' ')
    img = cv2.rectangle(img, (int(b[1]), h - int(b[2])), (int(b[3]), h - int(b[4])), (0, 255, 0), 2)
# show annotated image and wait for keypress
cv2.imshow(filename, img)
cv2.waitKey(0)
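The h - int(...) terms in the code above are needed because each line from image_to_boxes has the form "char left bottom right top page", with y measured from the bottom edge of the image, while OpenCV measures y from the top. A small sketch of that conversion, where box_to_top_left is a hypothetical helper and the sample line is hand-written:

```python
def box_to_top_left(line, image_height):
    """Convert one image_to_boxes() line to (char, left, top, right, bottom).

    Input y values are measured from the bottom edge of the image;
    the returned y values are measured from the top edge, as OpenCV expects.
    """
    char, left, bottom, right, top = line.split(' ')[:5]
    return (char, int(left), image_height - int(top),
            int(right), image_height - int(bottom))


# Hand-written sample line for an image that is 100 pixels tall
print(box_to_top_left("H 10 70 30 90 0", 100))  # ('H', 10, 10, 30, 30)
```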
小智 6
With the code below, you can get the bounding box corresponding to each character.
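import csv
import cv2
from pytesseract import pytesseract as pt
# Write character boxes to output.box. The lang/boxes keyword form of run_tesseract
# appears to match older pytesseract releases; newer releases use a different signature.
pt.run_tesseract('bw.png', 'output', lang=None, boxes=True, config="hocr")
# To read the coordinates
boxes = []
# Text mode with newline='' is what the csv module expects on Python 3
with open('output.box', 'r', newline='') as f:
    # QUOTE_NONE keeps a recognized '"' character from being treated as a CSV quote
    reader = csv.reader(f, delimiter=' ', quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) == 6:
            boxes.append(row)
# Draw the bounding boxes (box-file y coordinates are measured from the bottom edge)
img = cv2.imread('bw.png')
h, w, _ = img.shape
for b in boxes:
    img = cv2.rectangle(img, (int(b[1]), h - int(b[2])), (int(b[3]), h - int(b[4])), (255, 0, 0), 2)
cv2.imshow('output', img)
cv2.waitKey(0)  # keep the window open until a key is pressed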