Luk*_*kas 8 java pdf-form pdfbox
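How do I "flatten" a PDF form (remove the form fields but keep the field text) with PDFBox?
A quick way to do this is to remove the fields from the AcroForm.
To do so, you only need to get the document catalog, then the AcroForm, and then remove all fields from that AcroForm.
The graphical representation is linked to the annotation and stays in the document.
So I wrote this code: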
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentCatalog;
import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
import org.apache.pdfbox.pdmodel.interactive.form.PDField;
public class PdfBoxTest {
    public void test() throws Exception {
        PDDocument pdDoc = PDDocument.load(new File("E:\\Form-Test.pdf"));
        PDDocumentCatalog pdCatalog = pdDoc.getDocumentCatalog();
        PDAcroForm acroForm = pdCatalog.getAcroForm();
        if (acroForm == null) {
            System.out.println("No form-field --> stop");
            return;
        }
        @SuppressWarnings("unchecked")
        List<PDField> fields = acroForm.getFields();
        // set the text in the form-field <-- does work
        for (PDField field : fields) {
            if (field.getFullyQualifiedName().equals("formfield1")) {
                field.setValue("Test-String");
            }
        }
        // remove form-field but keep text ???
        // acroForm.getFields().clear(); <-- does not work
        // acroForm.setFields(null); <-- does not work
        // acroForm.setFields(new ArrayList()); <-- does not work
        // ???
        pdDoc.save("E:\\Form-Test-Result.pdf");
        pdDoc.close();
    }
}
Syl*_*gat 16
With PDFBox 2, you can now easily "flatten" a PDF form by calling the flatten method on the PDAcroForm object. See the Javadoc: PDAcroForm.flatten().
Simplified code with an example call of this method:
//Load the document
PDDocument pDDocument = PDDocument.load(new File("E:\\Form-Test.pdf"));
PDAcroForm pDAcroForm = pDDocument.getDocumentCatalog().getAcroForm();
//Fill the document
...
//Flatten the document
pDAcroForm.flatten();
//Save the document
pDDocument.save("E:\\Form-Test-Result.pdf");
pDDocument.close();
Note: dynamic XFA forms cannot be flattened.
To migrate from PDFBox 1.* to 2.0, take a look at the official migration guide.
小智 7
setReadOnly worked for me, as shown below:
@SuppressWarnings("unchecked")
List<PDField> fields = acroForm.getFields();
for (PDField field : fields) {
    if (field.getFullyQualifiedName().equals("formfield1")) {
        field.setReadOnly(true);
    }
}
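This definitely works. I ran into this problem and spent all night debugging, but finally figured out how to do it :)
This assumes that you are able to edit the PDF in some way, or have some control over the PDF.
First, edit the form using Acrobat Pro. Make the fields hidden and read-only.
Then you need to use two libraries: PDFBox and PDFClown.
PDFBox removes the thing that tells Adobe Reader that the document is a form; PDFClown removes the actual field. PDFClown must run first, then PDFBox (in that order; the other way around does not work).
Example code for a single field: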
// PDF Clown code
File file = new File("Some file path");
Document document = file.getDocument();
Form form = document.getForm();
Fields fields = form.getFields();
Field field = fields.get("some_field_name");
PageStamper stamper = new PageStamper();
FieldWidgets widgets = field.getWidgets();
Widget widget = widgets.get(0); // Generally is 0.. experiment to figure out
stamper.setPage(widget.getPage());
// Write text using text form field position as pivot.
PrimitiveComposer composer = stamper.getForeground();
Font font = Font.get(document, "some_path");
composer.setFont(font, 10);
double xCoordinate = widget.getBox().getX();
double yCoordinate = widget.getBox().getY();
composer.showText("text i want to display", new Point2D.Double(xCoordinate, yCoordinate));
// Actually delete the form field!
field.delete();
stamper.flush();
// Create new buffer to output to...
Buffer buffer = new Buffer();
file.save(buffer, SerializationModeEnum.Standard);
byte[] bytes = buffer.toByteArray();
// PDFBox code
InputStream pdfInput = new ByteArrayInputStream(bytes);
PDDocument pdfDocument = PDDocument.load(pdfInput);
// Tell Adobe we don't have forms anymore.
PDDocumentCatalog pdCatalog = pdfDocument.getDocumentCatalog();
PDAcroForm acroForm = pdCatalog.getAcroForm();
COSDictionary acroFormDict = acroForm.getDictionary();
COSArray cosFields = (COSArray) acroFormDict.getDictionaryObject("Fields");
cosFields.clear();
// Phew. Finally.
pdfDocument.save("Some file path");
There may be some typos here and there, but this should be enough to get the gist :)
小智 5
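After reading the PDF reference guide, I found that you can easily set read-only mode for AcroForm fields by adding an "Ff" key (field flags) with a value of 1. This is what the documentation says about this flag:
If set, the user may not change the value of the field. Any associated widget annotations will not interact with the user; that is, they will not respond to mouse clicks or change their appearance in response to mouse motions. This flag is useful for fields whose values are computed or imported from a database.
So the code might look like this (using the pdfbox lib):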
public static void makeAllWidgetsReadOnly(PDDocument pdDoc) throws IOException {
    PDDocumentCatalog catalog = pdDoc.getDocumentCatalog();
    PDAcroForm form = catalog.getAcroForm();
    List<PDField> acroFormFields = form.getFields();
    System.out.println(String.format("found %d AcroForm fields", acroFormFields.size()));
    for (PDField field : acroFormFields) {
        makeAcroFieldReadOnly(field);
    }
}
private static void makeAcroFieldReadOnly(PDField field) {
    // Ff = 1 sets only the ReadOnly bit of the field flags
    field.getDictionary().setInt("Ff", 1);
}
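One caveat with writing a plain 1 into "Ff": the value is a bit field, so overwriting it also clears any flags that were already set on the field (for example Required or Multiline). A minimal sketch of the bit arithmetic in plain Java follows. The class and method names are hypothetical and not part of PDFBox; the bit positions are taken from the field-flag tables of the PDF specification. With PDFBox you would presumably read the current "Ff" value from the field dictionary first, defaulting to 0 when the key is absent, and then write back the combined value.

```java
public class FieldFlags {
    // Flag values: bit position n in the PDF specification maps to 1 << (n - 1).
    public static final int READ_ONLY = 1;        // bit position 1
    public static final int REQUIRED = 1 << 1;    // bit position 2
    public static final int MULTILINE = 1 << 12;  // bit position 13 (text fields)

    // Returns the flags with the ReadOnly bit set and all other bits unchanged.
    public static int withReadOnly(int currentFlags) {
        return currentFlags | READ_ONLY;
    }

    // Reports whether the ReadOnly bit is set.
    public static boolean isReadOnly(int flags) {
        return (flags & READ_ONLY) != 0;
    }

    public static void main(String[] args) {
        int flags = REQUIRED | MULTILINE;        // 4098
        int updated = withReadOnly(flags);       // 4099
        System.out.println(updated);
        System.out.println(isReadOnly(updated));             // true
        System.out.println((updated & MULTILINE) != 0);      // true: Multiline survives
    }
}
```

Writing 1 directly is equivalent only when the field had no other flags set, which may well be the case for simple text fields.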
Here is Thomas's answer from the PDFBox mailing list:
You need to get the fields via the COSDictionary. Try this code...
PDDocument pdDoc = PDDocument.load(new File("E:\\Form-Test.pdf"));
PDDocumentCatalog pdCatalog = pdDoc.getDocumentCatalog();
PDAcroForm acroForm = pdCatalog.getAcroForm();
COSDictionary acroFormDict = acroForm.getDictionary();
COSArray fields = (COSArray) acroFormDict.getDictionaryObject("Fields");
fields.clear();