How do I get the coordinates of the lines that tesseract reads?

akm*_*kms 1 tesseract computer-vision python-tesseract

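Is there a way to read an image line by line with tesseract and get the coordinates of each line? Normally I can read every word in the dictionary that tesseract returns and get all of their positions, but there seems to be no option for line coordinates. I am using psm 6 to read line by line, but even with it I still receive word coordinates.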

d = pytesseract.image_to_data(img, lang="eng", output_type=Output.DICT)

fla*_*ite 5

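You can group the words that belong to each line and derive the line's bounding box from the bounding boxes of its words, with the leftmost and rightmost words giving the horizontal extent. Below is a Python implementation that groups the words of each line together.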

import pytesseract
from pytesseract import Output

# img is assumed to be an already loaded image (PIL image or NumPy array).
text = pytesseract.image_to_data(img, lang="eng", output_type=Output.DICT)

# Group word boxes by (block, paragraph, line). line_num restarts in every
# paragraph, so par_num has to be part of the key.
lines = {}
for i in range(len(text['text'])):
    txt = text['text'][i]
    if txt == '' or txt.isspace():
        continue  # skip rows without recognized text (page, block, paragraph, line rows)
    key = (text['block_num'][i], text['par_num'][i], text['line_num'][i])
    left, top = text['left'][i], text['top'][i]
    width, height = text['width'][i], text['height'][i]
    lines.setdefault(key, []).append((txt, left, top, width, height))

# The line box is the union of its word boxes. Taking min/max over all words
# (rather than only the first and last word) also covers words of different heights.
for line_idx, words in enumerate(lines.values(), start=1):
    xmin = min(left for _, left, _, _, _ in words)
    ymin = min(top for _, _, top, _, _ in words)
    xmax = max(left + width for _, left, _, width, _ in words)
    ymax = max(top + height for _, _, top, _, height in words)
    print("Line {} : {}, {}, {}, {}".format(line_idx, xmin, ymin, xmax, ymax))
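As a possible shortcut, the dictionary returned by image_to_data also has a 'level' column, and in Tesseract's TSV output level 4 marks a text line (1 = page, 2 = block, 3 = paragraph, 4 = line, 5 = word). The sketch below reads the line boxes directly from those rows instead of merging word boxes. The helper name line_boxes and the hand-made sample dictionary are illustrative assumptions, not part of pytesseract; with real data you would pass in the result of image_to_data(..., output_type=Output.DICT).

```python
def line_boxes(data):
    """Return (left, top, right, bottom) for every row marked as a text line.

    `data` is assumed to have the same layout as the dictionary returned by
    pytesseract.image_to_data(..., output_type=Output.DICT).
    """
    boxes = []
    for i, level in enumerate(data["level"]):
        if int(level) != 4:  # 4 = text line in Tesseract's TSV hierarchy
            continue
        left, top = data["left"][i], data["top"][i]
        right = left + data["width"][i]
        bottom = top + data["height"][i]
        boxes.append((left, top, right, bottom))
    return boxes


# Hand-made sample that mimics the dictionary layout (not real OCR output):
# one page, one block, one paragraph, two lines with their word rows.
sample = {
    "level":  [1,   2,   3,   4,   5,  5,  4,  5],
    "left":   [0,   10,  10,  10,  10, 60, 10, 10],
    "top":    [0,   10,  10,  10,  12, 10, 40, 40],
    "width":  [200, 150, 150, 120, 40, 70, 80, 80],
    "height": [100, 60,  60,  20,  16, 20, 22, 22],
}

for idx, box in enumerate(line_boxes(sample), start=1):
    print("Line {} : {}, {}, {}, {}".format(idx, *box))
```

Line rows carry no text of their own, so if you also need the line's content you still have to join the level 5 rows that share the same block_num, par_num and line_num.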