Split a multi-page PDF file into multiple PDF files with Python?

mon*_*kut 51 python pdf

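I would like to take a multi-page PDF file and create a separate PDF file for each page.

I have downloaded reportlab and browsed its documentation, but it seems to be aimed at PDF generation. I haven't seen anything about processing existing PDF files.

Is there a simple way to do this in Python?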

use*_*294 123

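from PyPDF2 import PdfFileWriter, PdfFileReader

# Note: these are the pre-3.0 PyPDF2 names; 3.0.0 renamed them to PdfReader/PdfWriter.
with open("document.pdf", "rb") as inputStream:
    inputpdf = PdfFileReader(inputStream)

    for i in range(inputpdf.numPages):
        # One writer per page, so each output file holds exactly one page.
        output = PdfFileWriter()
        output.addPage(inputpdf.getPage(i))
        with open("document-page%s.pdf" % i, "wb") as outputStream:
            output.write(outputStream)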

etc.

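  • @user26294: You should update the code to use PyPDF2, the actively maintained successor to pyPdf. Just replace `from pyPdf import ...` with `from PyPDF2 import ...`. (5 upvotes)
  • If you want the file names to be numbered from 1 instead of 0, use `with open("document-page%s.pdf" % (i+1), "wb") as outputStream:`. (3 upvotes)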
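
For reference, the same per-page split can be sketched with the `PdfReader`/`PdfWriter` names used by PyPDF2 3.0.0. This is only a minimal sketch, not the answerer's code: the helper names `page_filename` and `split_into_pages`, the `output_dir` parameter, and the zero-padded naming scheme are assumptions made for illustration.

```python
from pathlib import Path


def page_filename(stem, page_number, total_pages):
    """Build a 1-based, zero-padded name so the files sort in page order."""
    width = len(str(total_pages))
    return f"{stem}-page{page_number:0{width}d}.pdf"


def split_into_pages(input_path, output_dir="."):
    """Write every page of input_path to its own PDF and return the new paths."""
    # Imported here so page_filename stays usable without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(str(input_path))
    total = len(reader.pages)
    stem = Path(input_path).stem
    written = []
    for number, page in enumerate(reader.pages, start=1):
        # A fresh writer per page keeps each output to a single page.
        writer = PdfWriter()
        writer.add_page(page)
        target = Path(output_dir) / page_filename(stem, number, total)
        with open(target, "wb") as stream:
            writer.write(stream)
        written.append(target)
    return written
```

Calling `split_into_pages("document.pdf")` would then write one file per page next to the script, numbered from 1. Zero-padding is a design choice so that page 10 does not sort before page 2 in a directory listing.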

小智 7

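I was missing a solution here that splits a PDF into two parts which together contain all of the pages, so in case anyone is looking for the same thing, I'm adding mine: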

from PyPDF2 import PdfFileWriter, PdfFileReader

def split_pdf_to_two(filename, page_number):
    # Keep the input file open until both parts have been written.
    with open(filename, "rb") as pdf_file:
        pdf_reader = PdfFileReader(pdf_file)
        num_pages = pdf_reader.getNumPages()

        # Validate explicitly; assert statements are skipped under `python -O`.
        if page_number >= num_pages:
            print("Error: The PDF you are cutting has fewer pages than you want to cut!")
            return

        pdf_writer1 = PdfFileWriter()
        pdf_writer2 = PdfFileWriter()

        # Pages before the split point go into the first part.
        for page in range(page_number):
            pdf_writer1.addPage(pdf_reader.getPage(page))

        # The remaining pages go into the second part.
        for page in range(page_number, num_pages):
            pdf_writer2.addPage(pdf_reader.getPage(page))

        with open("part1.pdf", "wb") as file1:
            pdf_writer1.write(file1)

        with open("part2.pdf", "wb") as file2:
            pdf_writer2.write(file2)
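
The two-part split generalizes naturally to fixed-size chunks. The following is a hedged sketch of that idea rather than part of the answer: `chunk_ranges`, `split_into_chunks`, and the `partN.pdf` naming are hypothetical names chosen for illustration, and the PyPDF2 calls assume the 3.x `PdfReader`/`PdfWriter` API.

```python
def chunk_ranges(num_pages, chunk_size):
    """Return half-open, 0-based (start, stop) ranges that cover num_pages."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        (start, min(start + chunk_size, num_pages))
        for start in range(0, num_pages, chunk_size)
    ]


def split_into_chunks(input_path, chunk_size):
    """Write consecutive chunks of chunk_size pages to part1.pdf, part2.pdf, ..."""
    # Imported here so chunk_ranges stays usable without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(str(input_path))
    ranges = chunk_ranges(len(reader.pages), chunk_size)
    outputs = []
    for part, (start, stop) in enumerate(ranges, start=1):
        writer = PdfWriter()
        for index in range(start, stop):
            writer.add_page(reader.pages[index])
        name = f"part{part}.pdf"
        with open(name, "wb") as stream:
            writer.write(stream)
        outputs.append(name)
    return outputs
```

Keeping the range arithmetic in its own function makes the boundary cases (a last chunk that is shorter than the others, or an empty document) easy to check without touching any PDF.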


Jer*_*her 6

An updated solution for the latest version of PyPDF2 (3.0.0) that extracts a range of pages.

from PyPDF2 import PdfReader, PdfWriter

file_name = r'c:\temp\junk.pdf'
pages = (121, 130)

reader = PdfReader(file_name)
writer = PdfWriter()
page_range = range(pages[0], pages[1] + 1)

for page_num, page in enumerate(reader.pages, 1):
    if page_num in page_range:
        writer.add_page(page)

with open(f'{file_name}_page_{pages[0]}-{pages[1]}.pdf', 'wb') as out:
    writer.write(out)

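
A possible variation on the range extraction validates the requested range first and then indexes `reader.pages` directly instead of scanning every page. This is a sketch under assumptions: `page_indices` and `extract_range` are hypothetical helper names, and the output name (the input stem plus `_pages_FIRST-LAST.pdf`) is an illustrative choice, not the answerer's.

```python
from pathlib import Path


def page_indices(first, last, num_pages):
    """Convert a 1-based inclusive page range into a range of 0-based indices."""
    if not 1 <= first <= last:
        raise ValueError("expected 1 <= first <= last")
    if last > num_pages:
        raise ValueError(f"document has only {num_pages} pages")
    return range(first - 1, last)


def extract_range(input_path, first, last):
    """Copy pages first..last (1-based, inclusive) into a new PDF and return its path."""
    # Imported here so page_indices stays usable without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(str(input_path))
    writer = PdfWriter()
    for index in page_indices(first, last, len(reader.pages)):
        writer.add_page(reader.pages[index])

    source = Path(input_path)
    target = source.with_name(f"{source.stem}_pages_{first}-{last}.pdf")
    with open(target, "wb") as out:
        writer.write(out)
    return target
```

With this sketch, `extract_range(r'c:\temp\junk.pdf', 121, 130)` would raise a `ValueError` for a document shorter than 130 pages instead of silently writing a shorter file.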


Nik*_*ain 5

The PyPDF2 package lets you split a single PDF into multiple PDFs.

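import os
from PyPDF2 import PdfFileReader, PdfFileWriter

path = 'document.pdf'  # assumed input path; the snippet as posted left `path` undefined
# Base name without extension, used as the output prefix (`fname` was also undefined).
fname = os.path.splitext(os.path.basename(path))[0]

pdf = PdfFileReader(path)
for page in range(pdf.getNumPages()):
    # One writer per page, so each output file holds exactly one page.
    pdf_writer = PdfFileWriter()
    pdf_writer.addPage(pdf.getPage(page))

    # Number the output files from 1 rather than 0.
    output_filename = '{}_page_{}.pdf'.format(fname, page+1)

    with open(output_filename, 'wb') as out:
        pdf_writer.write(out)

    print('Created: {}'.format(output_filename))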

Source: https://www.blog.pythonlibrary.org/2018/04/11/splitting-and-merging-pdfs-with-python/