Sat*_*tor 7 windows python-3.x python-tesseract
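I have installed the pytesseract module in my venv and want to extract text from a German document.
I am running this script from the pytesseract documentation, with the language set to German: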
import cv2
import pytesseract
try:
    from PIL import Image
except ImportError:
    import Image
print(pytesseract.image_to_string(Image.open('test.jpg')))
print(pytesseract.image_to_string(Image.open('test.jpg'), lang='ger'))
This gives me:
raise TesseractError(proc.returncode, get_errors(error_string))
pytesseract.pytesseract.TesseractError: (1, 'Tesseract Open Source OCR Engine v3.05.00dev with Leptonica
Error opening data file C:\\Program Files (x86)\\Tesseract-OCR/tessdata/ger.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory. Failed loading language \'ger\' Tesseract couldn\'t load any languages! Could not initialize tesseract.')
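I found the language data at [tessdoc/Data-Files](https://github.com/tesseract-ocr/tessdoc/blob/master/Data-Files.md).
So far I have only found a guide for Linux: "How do I install a new language pack for Tesseract on 16.04".
Where do I need to put the language files so that the script works? Do they belong in the pytesseract folder under site-packages, or somewhere else?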
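To narrow the problem down, here is a minimal sketch that checks which language files a tessdata directory actually contains. It assumes the directory is the one named in the error message above. The helper names `find_traineddata` and `installed_languages` are hypothetical and not part of pytesseract. One thing worth checking with it: the Tesseract data files name German `deu.traineddata`, so a lookup for `ger` may fail even when German is installed.

```python
from pathlib import Path


def find_traineddata(lang, tessdata_dir):
    """Return the path to <lang>.traineddata in tessdata_dir, or None if it is missing.

    Hypothetical helper for illustration; not a pytesseract function.
    """
    candidate = Path(tessdata_dir) / f"{lang}.traineddata"
    return candidate if candidate.is_file() else None


def installed_languages(tessdata_dir):
    """List the language codes that have a .traineddata file in tessdata_dir."""
    directory = Path(tessdata_dir)
    if not directory.is_dir():
        # A missing directory simply means no languages were found there.
        return []
    return sorted(p.stem for p in directory.glob("*.traineddata"))
```

Calling `installed_languages(r"C:\Program Files (x86)\Tesseract-OCR\tessdata")` would show which codes can be passed as `lang=`, and `find_traineddata("deu", ...)` returning `None` would suggest the German file still has to be copied into that directory.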