Yas*_*rma 0 python pdf api docx file-conversion
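How can we convert a PDF to docx, with or without Python? I want to convert a large number of files automatically, so I need an API.
I have used online sites such as:
https://pdf2docx.com/
https://online2pdf.com/pdf2docx
https://www.zamzar.com/convert/pdf-to-docx/
but I could not use their APIs directly.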
pdf2docx
Installation
Install pdf2docx with pip, or clone or download the package and install it from source:
pip install pdf2docx
or
# download the package and install your environment
python setup.py install
Option 1
from pdf2docx import Converter
pdf_file = r'C:\Users\ABCD\Desktop\XYZ/Document1.pdf'# source file
docx_file = r'C:\Users\ABCD\Desktop\XYZ/sample.docx' # destination file
# convert pdf to docx
cv = Converter(pdf_file)
cv.convert(docx_file, start=0, end=None)
cv.close()
# output
Parsing Page 53: 53/53...
Creating Page 53: 53/53...
--------------------------------------------------
Terminated in 6.258919400000195s.
Option 2
from pdf2docx import parse
pdf_file = r'C:\Users\ABCD\Desktop\XYZ/Document2.pdf' # source file
docx_file = r'C:\Users\ABCD\Desktop\XYZ/sample_2.docx' # destination file
# convert pdf to docx
parse(pdf_file, docx_file, start=0, end=None)
# output
Parsing Page 53: 53/53...
Creating Page 53: 53/53...
--------------------------------------------------
Terminated in 5.883666100000482s.
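Since the question is about converting many files automatically, either option can be wrapped in a loop over a folder. The following is only a minimal sketch: the helper names `batch_convert` and `_pdf2docx_parse` and the `convert` parameter are made up for illustration and are not part of pdf2docx. It assumes all PDFs sit directly in one folder and use a lowercase `.pdf` extension.

```python
from pathlib import Path
from typing import Callable, List, Optional


def _pdf2docx_parse(pdf_path: str, docx_path: str) -> None:
    # Hypothetical wrapper around the parse() call from Option 2.
    # Imported lazily so pdf2docx is only needed when it is actually used.
    from pdf2docx import parse
    parse(pdf_path, docx_path, start=0, end=None)


def batch_convert(
    src_dir,
    dst_dir,
    convert: Optional[Callable[[str, str], None]] = None,
) -> List[Path]:
    """Convert every *.pdf in src_dir to a .docx with the same stem in dst_dir.

    Returns the list of output paths that were converted without an error.
    """
    convert = convert or _pdf2docx_parse
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)

    written = []
    for pdf in sorted(src.glob("*.pdf")):
        out = dst / (pdf.stem + ".docx")
        try:
            convert(str(pdf), str(out))
        except Exception as exc:
            # One bad file should not stop the whole batch.
            print(f"Skipped {pdf.name}: {exc}")
            continue
        written.append(out)
    return written
```

It could be called as `batch_convert(r'C:\Users\ABCD\Desktop\XYZ', r'C:\Users\ABCD\Desktop\XYZ\docx')`. Passing the conversion function as a parameter keeps the loop independent of the library, so the same loop would work with the `Converter` class from Option 1.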