I'm trying to use pdf2image, and it seems I need something called poppler:
(sum_env) C:\Users\antoi\Documents\Programming\projects\summarizer>python ocr.py -i fr13_idf.pdf
Traceback (most recent call last):
  File "c:\Users\antoi\Documents\Programming\projects\summarizer\sum_env\lib\site-packages\pdf2image\pdf2image.py", line 165, in __page_count
    proc = Popen(["pdfinfo", pdf_path], stdout=PIPE, stderr=PIPE)
  File "C:\Python37\lib\subprocess.py", line 769, in __init__
    restore_signals, start_new_session)
  File "C:\Python37\lib\subprocess.py", line 1172, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "ocr.py", line 53, in <module>
    pdfspliterimager(image_path)
  File "ocr.py", line 32, in pdfspliterimager
    pages = convert_from_path("document-page%s.pdf" % i, 500)
  File "c:\Users\antoi\Documents\Programming\projects\summarizer\sum_env\lib\site-packages\pdf2image\pdf2image.py", line 30, in convert_from_path
    page_count = __page_count(pdf_path, userpw)
  File "c:\Users\antoi\Documents\Programming\projects\summarizer\sum_env\lib\site-packages\pdf2image\pdf2image.py", line 169, in __page_count
    raise Exception('Unable to get page count. Is poppler installed and in PATH?')
Exception: Unable to get page count. Is poppler installed and in PATH?
I tried this link, but downloading things did not solve my problem.
小智 13
First of all, download Poppler from here, then extract it. In your code, just add poppler_path=r'C:\Program Files\poppler-0.68.0\bin' (for example), as shown below:
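from pdf2image import convert_from_path

# Point pdf2image at the Poppler "bin" folder directly (500 is the DPI).
images = convert_from_path("mypdf.pdf", 500, poppler_path=r'C:\Program Files\poppler-0.68.0\bin')
# Save each page as a PNG: image0.png, image1.png, ...
for i, image in enumerate(images):
    fname = 'image' + str(i) + '.png'
    image.save(fname, "PNG")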
That's it. With this approach you don't need to add an environment variable. Let me know if you have any problems.
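Before passing a folder as poppler_path, it can help to confirm that the folder really contains the pdfinfo executable, since that is the program the traceback in the question fails to find. The following is only a sketch using the standard library; the helper name find_pdfinfo is hypothetical, and the folder shown is just the example location from this answer.

```python
import shutil
from typing import Optional


def find_pdfinfo(poppler_path: Optional[str] = None) -> Optional[str]:
    """Return the full path to the pdfinfo executable, or None if not found.

    If poppler_path is given, that directory is searched instead of PATH
    (on Windows, shutil.which may also look in the current directory).
    Otherwise the directories listed in the PATH variable are searched.
    """
    return shutil.which("pdfinfo", path=poppler_path)


if __name__ == "__main__":
    # Example location taken from the answer; adjust to your own install.
    candidate = r"C:\Program Files\poppler-0.68.0\bin"
    found = find_pdfinfo(candidate)
    if found is None:
        print("pdfinfo not found in", candidate)
    else:
        print("pdfinfo found at", found)
```

If this reports that pdfinfo is missing, the path probably points at the extracted top-level folder rather than the folder that holds the executables.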
dat*_*ght 13
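When using pdf2image, a few dependencies need to be satisfied:
Install pdf2image
pip install pdf2image
Install python-dateutil
pip install python-dateutil
Install Poppler
Add the Poppler path to the environment variables (system PATH)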
pages = convert_from_path(filepath, poppler_path=r"actualpoppler_path")
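If editing the system environment variables is not convenient, the PATH step above can be approximated for the current Python process only. This is a rough sketch, not part of pdf2image: the helper name prepend_to_path is hypothetical, and it assumes pdf2image looks up pdfinfo through the process's PATH when no poppler_path is passed, which is what the traceback in the question suggests.

```python
import os


def prepend_to_path(directory: str) -> None:
    """Put directory at the front of this process's PATH, at most once."""
    entries = [e for e in os.environ.get("PATH", "").split(os.pathsep) if e]
    if directory not in entries:
        os.environ["PATH"] = os.pathsep.join([directory] + entries)


# Usage (the folder is a placeholder, like "actualpoppler_path" above):
# prepend_to_path(r"actualpoppler_path")
# pages = convert_from_path(filepath)
```

The change lasts only until the interpreter exits, so it does not replace a proper system-wide PATH entry.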
小智 8
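The pdf2image and pdftotext libraries require Poppler as their backend, so you have to install it:
conda install -c conda-forge poppler
The error should then be resolved. If it still doesn't work for you, you can install the library by following http://blog.alivate.com.au/poppler-windows/.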
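To check whether the install worked, one option is to try launching pdfinfo the same way the traceback in the question shows pdf2image doing it, and to catch the FileNotFoundError. A minimal sketch, assuming pdfinfo accepts the -v (print version) flag; the function name can_launch is hypothetical.

```python
import subprocess


def can_launch(command: str = "pdfinfo") -> bool:
    """Return True if `command -v` can be started, False if the executable is missing."""
    try:
        subprocess.run(
            [command, "-v"],
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,  # only care whether the program starts, not its exit code
        )
    except FileNotFoundError:
        return False
    return True


if __name__ == "__main__":
    print("pdfinfo can be launched:", can_launch())
```

Run it from the same environment (for example the activated conda environment) that runs the conversion script, because PATH can differ between shells.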
小智 6
On Windows, to resolve PDFInfoNotInstalledError: Unable to get page count. Is poppler installed and in PATH?:
Install Chocolatey (https://chocolatey.org/install), then run:
choco install poppler