RuntimeError: Failed to init API, possibly an invalid tessdata path: <>

Yad*_* Sd 5 tesseract python-3.x

I am on Windows and want to use tesserocr's fontAttributes to detect text in an image. But when I run my Python code, I get this error - RuntimeError: Failed to init API, possibly an invalid tessdata path: C:\Program Files (x86)\Tesseract-OCR\tessdata/

i) I have installed -

tesseract-ocr-w32-setup-v5.0.0-alpha.20190623.exe
//(though my system is 64 bit)

ii) Added these to the Path variable (both the system Path and the user Path) -

C:\Program Files (x86)\Tesseract-OCR
C:\Program Files (x86)\Tesseract-OCR\tessdata

iii) Created a new system variable, TESSDATA_PREFIX, pointing to the tessdata folder, like -
TESSDATA_PREFIX - C:\Program Files (x86)\Tesseract-OCR\tessdata
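
To double-check that the variable is visible from Python and that the folder really contains language files, something like the following can be run. It is only a sketch using the standard library: `list_traineddata` is a hypothetical helper name, not part of tesserocr, and it assumes language files follow the usual `<lang>.traineddata` naming.

```python
import os
from pathlib import Path


def list_traineddata(tessdata_dir):
    """Return the sorted language codes that have a .traineddata file in tessdata_dir."""
    folder = Path(tessdata_dir)
    if not folder.is_dir():
        # A missing or mistyped folder yields no languages at all.
        return []
    return sorted(p.stem for p in folder.glob("*.traineddata"))


# Show what the current process actually sees; a freshly set system variable
# is only picked up by consoles and IDEs started after it was created.
prefix = os.environ.get("TESSDATA_PREFIX")
print("TESSDATA_PREFIX =", prefix)
print("languages found:", list_traineddata(prefix) if prefix else [])
```

If the language passed as `lang` (here `bask`) is not in the printed list, the API has no `bask.traineddata` to load from that folder.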


import pytesseract
import locale

# Force the "C" locale before the Tesseract API is created
locale.setlocale(locale.LC_ALL, 'C')

from tesserocr import PyTessBaseAPI, RIL, iterate_level, OEM

# The legacy engine (TESSERACT_ONLY) is requested for font attributes
with PyTessBaseAPI(oem=OEM.TESSERACT_ONLY,lang='bask') as api:
    api.SetImageFile('sugar.png')
    api.Recognize()

    ri = api.GetIterator()
    level = RIL.WORD

    # Print each recognized word together with its font attributes
    for r in iterate_level(ri, level):
        attrs = r.WordFontAttributes()
        symbol = r.GetUTF8Text(level)
        print(symbol, attrs)


    with PyTessBaseAPI(oem=OEM.TESSERACT_ONLY,lang='bask') as api:
  File "tesserocr.pyx", line 1168, in tesserocr._tesserocr.PyTessBaseAPI.__cinit__
  File "tesserocr.pyx", line 1181, in tesserocr._tesserocr.PyTessBaseAPI._init_api
RuntimeError: Failed to init API, possibly an invalid tessdata path: C:\Program Files (x86)\Tesseract-OCR\tessdata/

SUB*_*KHA 2

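You probably don't have the .traineddata file on your system. You have to copy it from

C:\Program Files\Tesseract-OCR\tessdata

and paste all the data files into your directory. I also suggest creating a virtual environment and working inside it.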
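
The copy step can also be scripted. This is a rough sketch, not the only way to do it: `copy_traineddata` is a hypothetical helper, and the destination folder is whatever directory you choose. It assumes the language file is named `<lang>.traineddata`, so `lang='bask'` needs a `bask.traineddata`.

```python
import shutil
from pathlib import Path


def copy_traineddata(src_dir, dst_dir, lang):
    """Copy <lang>.traineddata from src_dir into dst_dir and return the new path.

    Raises FileNotFoundError if the source file does not exist, which is
    the situation that makes the API fail to initialize.
    """
    src = Path(src_dir) / f"{lang}.traineddata"
    if not src.is_file():
        raise FileNotFoundError(f"{src} does not exist")
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)  # create the target folder if needed
    return Path(shutil.copy2(src, dst / src.name))
```

For example, `copy_traineddata(r"C:\Program Files\Tesseract-OCR\tessdata", "my_tessdata", "eng")` would place `eng.traineddata` in a local `my_tessdata` folder (a made-up name). The folder can then be handed to tesserocr directly through the constructor's `path` argument, e.g. `PyTessBaseAPI(path="my_tessdata", lang="eng")`, rather than relying on TESSDATA_PREFIX.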