How do I fill out a PDF form in Python?

Mar*_*zik 18 python forms pdf fill

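How can I fill a PDF form with data and then "flatten" it?

I currently use pdftk, but it does not handle national characters correctly.

Is there a Python library, or an example, that shows how to fill in a PDF form and render it as a non-editable PDF file?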

Mar*_*oma 21

Straight from the pypdf documentation (added a few years after the question was asked):

from pypdf import PdfReader, PdfWriter

reader = PdfReader("form.pdf")
writer = PdfWriter()

page = reader.pages[0]
fields = reader.get_fields()

writer.add_page(page)

writer.update_page_form_field_values(
    writer.pages[0], {"fieldname": "some filled in text"}
)

# write the filled-in form to filled-out.pdf
with open("filled-out.pdf", "wb") as output_stream:
    writer.write(output_stream)
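
Before filling, it can help to check that the keys you pass actually match field names in the form, since a mistyped name is easy to miss. The following is only a sketch under some assumptions: `split_known_fields` and `fill_form` are hypothetical helper names, the form is assumed to have a flat (non-nested) field hierarchy, and `PdfWriter.append` is assumed to be available in the installed pypdf version.

```python
def split_known_fields(available, values):
    """Split `values` into (known, unknown) using the form's field names."""
    known = {name: val for name, val in values.items() if name in available}
    unknown = sorted(set(values) - set(available))
    return known, unknown


def fill_form(src, dst, values):
    """Fill `src` with the known values, write `dst`, return unknown names."""
    # Third-party import kept local so the helper above works without pypdf.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src)
    # get_fields() returns None when the document has no form fields.
    available = reader.get_fields() or {}
    known, unknown = split_known_fields(available, values)

    writer = PdfWriter()
    writer.append(reader)  # assumption: copies the pages together with the form
    for page in writer.pages:
        writer.update_page_form_field_values(page, known)

    with open(dst, "wb") as output_stream:
        writer.write(output_stream)
    return unknown
```

Calling `fill_form("form.pdf", "filled-out.pdf", {"fieldname": "some filled in text"})` would return the list of names that were not found in the form, which you can log or treat as an error.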

  • I am the maintainer of PyPDF2 and pypdf. I moved PyPDF2 back to pypdf. From now on, only pypdf will receive new features and bug fixes. (5 upvotes)

Tyl*_*ian 12

Try the fillpdf library, which makes this process very simple (pip install fillpdf, plus the poppler dependency: conda install -c conda-forge poppler).

Basic usage:

from fillpdf import fillpdfs

fillpdfs.get_form_fields("blank.pdf")

# returns a dictionary of fields
# Set the values in the returned dictionary and save it to a variable
# For radio boxes ('Off' = not filled, 'Yes' = filled)

data_dict = {
    'Text2': 'Name',
    'Text4': 'LastName',
    'box': 'Yes',
}

fillpdfs.write_fillable_pdf('blank.pdf', 'new.pdf', data_dict)

# If you want it flattened:
fillpdfs.flatten_pdf('new.pdf', 'newflat.pdf')
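
If your source data stores box states as booleans, a small conversion step keeps the dictionary in the shape shown above. This is a minimal sketch: `to_fillpdf_values` is a hypothetical helper, and the 'Yes'/'Off' strings are taken from the comment in the example, so a particular form may use a different "on" value.

```python
def to_fillpdf_values(values, on="Yes", off="Off"):
    """Convert booleans to on/off strings; leave every other value as-is."""
    converted = {}
    for name, value in values.items():
        if isinstance(value, bool):
            converted[name] = on if value else off
        else:
            converted[name] = value
    return converted


data_dict = to_fillpdf_values({"Text2": "Name", "Text4": "LastName", "box": True})
# data_dict can then be passed to fillpdfs.write_fillable_pdf as in the example.
```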

More information here: https://github.com/t-houssian/fillpdf

It seems to fill forms very well.

See the answer here for more information: /sf/answers/4676670491/

  • This library worked great for me! (2 upvotes)

Via*_*ech 1

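You do not need a library to flatten a PDF. According to the Adobe docs, you can set bit position 1 of an editable form field's flags to make the field read-only. I posted a complete solution here, but it uses Django: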

/sf/answers/3871126311/

Adobe docs (page 441):

https://opensource.adobe.com/dc-acrobat-sdk-docs/standards/pdfstandards/pdf/PDF32000_2008.pdf
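
The bit positions in the specification are 1-based, so "bit position 1" corresponds to the integer value 1 in the field's /Ff entry. As a rough illustration of the arithmetic only (the constant and function names are hypothetical, and no PDF library is involved):

```python
# Bit position n in the spec corresponds to the integer value 1 << (n - 1).
READ_ONLY = 1 << 0   # bit position 1
REQUIRED = 1 << 1    # bit position 2
NO_EXPORT = 1 << 2   # bit position 3


def make_read_only(flags: int) -> int:
    """Return `flags` with the ReadOnly bit set and all other bits unchanged."""
    return flags | READ_ONLY


def is_read_only(flags: int) -> bool:
    """Report whether the ReadOnly bit is set in `flags`."""
    return bool(flags & READ_ONLY)
```

Writing the plain value 1 to /Ff also makes the field read-only, but it clears any other flags the field had; OR-ing the bit in preserves them.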

Use PyPDF2 to fill in the fields, then loop over the annotations to change the bit position:

from io import BytesIO

# Uses the legacy camelCase API (PdfFileReader, addPage, ...), so it needs PyPDF2 < 3.0
import PyPDF2
from PyPDF2.generic import BooleanObject, NameObject, IndirectObject, NumberObject


# Defined before use so the script runs top to bottom
def set_need_appearances_writer(writer):
    # basically used to ensure there are no overlapping
    # form fields, which make printing hard
    try:
        catalog = writer._root_object
        # get the AcroForm tree and add the /NeedAppearances attribute
        if "/AcroForm" not in catalog:
            writer._root_object.update({
                NameObject("/AcroForm"): IndirectObject(len(writer._objects), 0, writer)})

        need_appearances = NameObject("/NeedAppearances")
        writer._root_object["/AcroForm"][need_appearances] = BooleanObject(True)
    except Exception as e:
        print('set_need_appearances_writer() catch : ', repr(e))

    return writer


# open the pdf
input_stream = open("YourPDF.pdf", "rb")
pdf_reader = PyPDF2.PdfFileReader(input_stream, strict=False)
if "/AcroForm" in pdf_reader.trailer["/Root"]:
    pdf_reader.trailer["/Root"]["/AcroForm"].update(
        {NameObject("/NeedAppearances"): BooleanObject(True)})

pdf_writer = PyPDF2.PdfFileWriter()
set_need_appearances_writer(pdf_writer)
if "/AcroForm" in pdf_writer._root_object:
    # AcroForm holds the form fields; set /NeedAppearances to fix printing issues
    pdf_writer._root_object["/AcroForm"].update(
        {NameObject("/NeedAppearances"): BooleanObject(True)})

data_dict = dict()  # this is a dict of your DB form values

pdf_writer.addPage(pdf_reader.getPage(0))
page = pdf_writer.getPage(0)
# update form fields
pdf_writer.updatePageFormFieldValues(page, data_dict)
for j in range(0, len(page['/Annots'])):
    writer_annot = page['/Annots'][j].getObject()
    for field in data_dict:
        if writer_annot.get('/T') == field:
            # set the ReadOnly bit (bit position 1) and keep any other flags
            flags = int(writer_annot.get('/Ff', 0))
            writer_annot.update({
                NameObject("/Ff"): NumberObject(flags | 1)
            })
output_stream = BytesIO()
pdf_writer.write(output_stream)
input_stream.close()

# output_stream now holds the filled PDF with read-only fields