iText 2.1.7 PdfCopy.addPage(page) can't find page reference?

Question

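I am maintaining a web application that uses iText 2.1.7 to create PDFs. I want to take the contents of an existing PDF and put them into the PDF document that the code is creating. I have the following (EDIT: more complete code):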

package itexttest;

import com.lowagie.text.Document;
import com.lowagie.text.PageSize;
import com.lowagie.text.Paragraph;
import com.lowagie.text.pdf.PdfCopy;
import com.lowagie.text.pdf.PdfImportedPage;
import com.lowagie.text.pdf.PdfReader;
import com.lowagie.text.pdf.PdfWriter;
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;

public class ITextTest 
{
    public static void main(String[] args) 
    {
        try
        {
            ByteArrayOutputStream os = new ByteArrayOutputStream();
            Document bigDoc = new Document(PageSize.LETTER, 50, 50, 110, 60);
            PdfWriter writer = PdfWriter.getInstance(bigDoc, os);
            bigDoc.open();

            Paragraph par = new Paragraph("one");
            bigDoc.add(par);
            bigDoc.add(new Paragraph("three"));

            addPdfPage(bigDoc, os, "c:/insertable.pdf");

            bigDoc.close();
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }

    private static void addPdfPage(Document document, OutputStream outputStream, String location) {
        try {

            PdfReader pdfReader = new PdfReader(location);
            int pages = pdfReader.getNumberOfPages();

            PdfCopy pdfCopy = new PdfCopy(document, outputStream);
            PdfImportedPage page = pdfCopy.getImportedPage(pdfReader, 1);
            pdfCopy.addPage(page);
        }
        catch (Exception e) {
            System.out.println("Cannot add PDF from PSC: <" + location + ">: " + e.getMessage());
            e.printStackTrace();
        }
    }

}

This throws an error, a null coming from PdfWriter.getPageReference().

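How am I using this incorrectly? How can I take a page from an existing document and put it into the current document? Note that I am in a position where it is not at all convenient to write files as temporary storage or anything like that.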

Answer 1

mkl*_*mkl 5

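I no longer actively work with the old iText versions, but some things have not changed since then. So here are some issues in your code and pointers that help to resolve them: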

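The main issues in your current code

You reuse the Document instance (which you already use for the PdfWriter and which is already open) for a PdfCopy. While a Document can support multiple listeners, they all need to be registered before open is called; the use case for this construct is creating the same document in two different formats in parallel.
You use the same output stream for both your PdfWriter and your PdfCopy. The result is not one valid PDF but byte ranges from two different PDFs wildly mixed together, which is definitely not a valid PDF.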
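
The second issue can be illustrated without iText at all. The following is only a stdlib analogy, not iText code: `ToyWriter` is a hypothetical stand-in for any writer that emits a header, content and a trailer, and the bracketed markers are made up for the demonstration. It sketches how two independent writers that share one stream end up with their output interleaved.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Stdlib-only analogy (no iText involved): two independent writers sharing one stream. */
class SharedStreamDemo {

    /** Hypothetical stand-in for a PDF writer: emits a header, some content and a trailer. */
    static final class ToyWriter {
        private final ByteArrayOutputStream out;
        private final String name;

        ToyWriter(ByteArrayOutputStream out, String name) {
            this.out = out;
            this.name = name;
            write("[" + name + ":header]"); // written as soon as the writer is set up
        }

        void addContent(String text) {
            write("[" + name + ":" + text + "]");
        }

        void close() {
            write("[" + name + ":trailer]");
        }

        private void write(String s) {
            byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
            out.write(bytes, 0, bytes.length);
        }
    }

    /** Both writers target the same stream, mirroring the situation in the question. */
    static String writeInterleaved() {
        ByteArrayOutputStream shared = new ByteArrayOutputStream();
        ToyWriter a = new ToyWriter(shared, "A");
        a.addContent("one");
        ToyWriter b = new ToyWriter(shared, "B"); // second writer on the same stream
        b.addContent("imported");
        b.close();
        a.close();
        return new String(shared.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // prints [A:header][A:one][B:header][B:imported][B:trailer][A:trailer]
        System.out.println(writeInterleaved());
    }
}
```

Neither writer's output is contiguous any more: B's complete "file" sits in the middle of A's. Giving each writer its own stream avoids this.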

Using `PdfCopy` correctly

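You can restructure your code by first creating a new PDF containing the new paragraphs in a ByteArrayOutputStream (closing the Document involved), and then copying this PDF and the other pages you want to add into a new PDF.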

E.g. like this:

ByteArrayOutputStream os = new ByteArrayOutputStream();
Document bigDoc = new Document(PageSize.LETTER, 50, 50, 110, 60);
PdfWriter writer = PdfWriter.getInstance(bigDoc, os);
bigDoc.open();
Paragraph par = new Paragraph("one");
bigDoc.add(par);
bigDoc.add(new Paragraph("three"));
bigDoc.close();

ByteArrayOutputStream os2 = new ByteArrayOutputStream();
Document finalDoc = new Document();
// write into os2 (in memory) so that os2.toByteArray() below holds the result
PdfCopy copy = new PdfCopy(finalDoc, os2);
finalDoc.open();
PdfReader reader = new PdfReader(os.toByteArray());
for (int i = 0; i < reader.getNumberOfPages();) {
    copy.addPage(copy.getImportedPage(reader, ++i));
}
PdfReader pdfReader = new PdfReader("c:/insertable.pdf");
copy.addPage(copy.getImportedPage(pdfReader, 1));
finalDoc.close();
reader.close();
pdfReader.close();

// result PDF
byte[] result = os2.toByteArray();

Using only `PdfWriter`

Alternatively, you can change your code to import the page directly into your PdfWriter, e.g. like this:

ByteArrayOutputStream os = new ByteArrayOutputStream();
Document bigDoc = new Document(PageSize.LETTER, 50, 50, 110, 60);
PdfWriter writer = PdfWriter.getInstance(bigDoc, os);
bigDoc.open();
Paragraph par = new Paragraph("one");
bigDoc.add(par);
bigDoc.add(new Paragraph("three"));

PdfReader pdfReader = new PdfReader("c:/insertable.pdf");
PdfImportedPage page = writer.getImportedPage(pdfReader, 1);
bigDoc.newPage();
PdfContentByte canvas = writer.getDirectContent();
canvas.addTemplate(page, 1, 0, 0, 1, 0, 0);

bigDoc.close();
pdfReader.close();

// result PDF
byte[] result = os.toByteArray();

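This approach looks better because no intermediate PDF is required. Unfortunately, this appearance is deceptive: the approach has some disadvantages of its own.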

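Here the original page is not copied as a whole and added to the document as is. Instead, only its content stream is used as the content of a template, which is then referenced from the actual new document page. In particular, this means: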

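If the dimensions of the imported page differ from those of your new target document, some parts of it might be cut off while some parts of the new page remain empty. Because of this, you will often find variants of the code above that scale and rotate to try to make the imported page fit the target page.
The original page content is now located in a template that is referenced from the new page. If you import this new page into yet another document using the same mechanism, you get a page that references a template which in turn merely references a template holding the original content. If you import this page into another document, you get another level of indirection. And so on.

Unfortunately, conforming PDF viewers only need to support this indirection to a limited degree. If you continue this process, your page content may suddenly no longer be visible. If the original page already brings along its own hierarchy of referenced templates, this may happen sooner rather than later.
As only the content is copied, properties of the original page that are not in the content stream are lost. This in particular concerns annotations, such as form fields, certain types of highlight markings, or even certain types of free text.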
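
The scaling variants mentioned in the first point mostly come down to computing the six matrix values passed to `addTemplate(page, a, b, c, d, e, f)`. Below is a minimal sketch of that arithmetic only, under stated assumptions: the source page's lower-left corner is at the origin, the page is not rotated, and uniform scaling with centering is what is wanted. The class and method names (`FitToPage`, `fitMatrix`) are hypothetical helpers, not part of iText.

```java
import java.util.Arrays;

/** Stdlib-only arithmetic; the six numbers map onto addTemplate(page, a, b, c, d, e, f). */
class FitToPage {

    /**
     * Returns {a, b, c, d, e, f} for scaling a source page uniformly so that it fits
     * inside the target page, centered. Assumes an unrotated source page whose
     * lower-left corner is at (0, 0).
     */
    static float[] fitMatrix(float srcWidth, float srcHeight, float dstWidth, float dstHeight) {
        // Use the smaller ratio so that neither dimension overflows the target page.
        float scale = Math.min(dstWidth / srcWidth, dstHeight / srcHeight);
        // Center whatever space is left over in each direction.
        float offsetX = (dstWidth - srcWidth * scale) / 2f;
        float offsetY = (dstHeight - srcHeight * scale) / 2f;
        return new float[] { scale, 0f, 0f, scale, offsetX, offsetY };
    }

    public static void main(String[] args) {
        // An A4 source page (595 x 842 pt) placed on a LETTER target page (612 x 792 pt).
        float[] m = fitMatrix(595f, 842f, 612f, 792f);
        System.out.println(Arrays.toString(m));
    }
}
```

With such values, the call in the snippet above would presumably become `canvas.addTemplate(page, m[0], m[1], m[2], m[3], m[4], m[5])`. This only addresses the clipping issue; the indirection and lost-annotation issues remain.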

(By the way, in the terminology of the general PDF specification these templates are called Form XObjects.)

This answer explicitly deals with the use of PdfCopy and PdfWriter in the context of merging PDFs.

Archived:	10 years ago
Views:	4131
Last activity:	10 years ago
