iText: extract the original from a signed PDF and compare hashes

Question

I have a signed PDF. The signature covers the entire document and is valid.

I want to extract the original PDF to compare its hash with that of the unsigned PDF.

I extract the original PDF using the following code:

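import com.itextpdf.text.pdf.AcroFields;
import com.itextpdf.text.pdf.PdfReader;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

PdfReader reader = new PdfReader(FILESIGNED);
try {
    AcroFields acroFields = reader.getAcroFields();
    // The PDF contains exactly one signature.
    String signatureName = acroFields.getSignatureNames().get(0);
    // Copy the revision covered by that signature to FILEORIGINAL.
    try (InputStream in = acroFields.extractRevision(signatureName);
         OutputStream out = new FileOutputStream(FILEORIGINAL)) {
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
    }
} finally {
    reader.close();
}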

But the extracted PDF is not the same as the original. I would like to extract the revision from before the signature was applied. Is that possible?

Thanks for help. Sara

Answer 1

mkl*_*mkl 6

I want to extract the original pdf to compare its hash with that of the unsigned pdf.

In general this is not possible.

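When iText (or another PDF signing library or application) signs a PDF, it:

adds a signature form field to the PDF (unless an empty signature form field is already present and chosen for signing);
adds a dictionary object to the PDF with some signature-related entries, in particular a large placeholder entry into which the CMS signature container will eventually be inserted; this dictionary is set as the value of the form field above;
adds a visualization to the form field, often containing some data from the signer certificate (unless invisible signing is chosen);
makes certain other form fields read-only if the empty signature field being signed carries field lock information;
finalizes the PDF, i.e. sets metadata such as the last modification time, and then writes the finalized PDF to a file or a byte array;
calculates a hash value of the finalized PDF, excluding the value of the large placeholder but including all the other changes described above;
signs this hash value, producing a CMS signature container;
and puts this signature container into the large placeholder.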
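
To illustrate the hashing step, here is a minimal sketch in plain Java (no iText involved). It assumes the signed ranges are given as (offset, length) pairs, in the way a signature's /ByteRange entry lists them, with the gap between the two ranges being the placeholder. The class name SignedRangeDigest, the method digest, and the choice of SHA-256 are assumptions made for this illustration, not what iText does internally.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class SignedRangeDigest {

    /**
     * Hashes only the bytes selected by byteRange, which holds
     * (offset, length) pairs. Everything between the ranges, i.e. the
     * placeholder for the signature container, is skipped.
     */
    static byte[] digest(byte[] pdf, int[] byteRange) {
        MessageDigest md;
        try {
            // SHA-256 is an assumption; a real signature names its own algorithm.
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
        for (int i = 0; i + 1 < byteRange.length; i += 2) {
            md.update(pdf, byteRange[i], byteRange[i + 1]);
        }
        return md.digest();
    }
}
```

Because every byte outside the placeholder goes into the hash, the added form field, the signature dictionary, and the updated metadata are all part of what is signed. That is why this hash cannot match the hash of the unsigned file.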

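Thus, there generally is no "original pdf" left to extract from the signed PDF file, because the changes above may have fundamentally changed the internal structure of the PDF.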

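There is one exception, though: if these changes were applied as an incremental update (in iText lingo: in append mode), the original can usually be retrieved by cutting off that incremental update.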

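To do so, simply search for the latest end-of-file marker before the signature and cut right after it. (Some uncertainty remains: the end-of-line marker following that end-of-file marker may or may not be part of the original PDF.)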
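
The cut described above could be sketched as follows, in plain Java without iText. It assumes the signed file was created in append mode. The class RevisionCutter, the method cutBeforeSignature, and the parameter signatureOffset are hypothetical names for this illustration; signatureOffset stands for any byte position known to lie inside the signing update, for instance the start of the gap in the signature's /ByteRange.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class RevisionCutter {

    private static final byte[] EOF_MARKER =
            "%%EOF".getBytes(StandardCharsets.US_ASCII);

    /**
     * Returns the bytes up to and including the last "%%EOF" marker that
     * ends at or before signatureOffset, or null if no such marker exists.
     * Any end-of-line bytes after the marker are NOT included.
     */
    static byte[] cutBeforeSignature(byte[] signedPdf, int signatureOffset) {
        int limit = Math.min(signatureOffset, signedPdf.length);
        // Scan backwards so the first hit is the latest marker before the signature.
        for (int start = limit - EOF_MARKER.length; start >= 0; start--) {
            if (matchesAt(signedPdf, start)) {
                return Arrays.copyOf(signedPdf, start + EOF_MARKER.length);
            }
        }
        return null;
    }

    private static boolean matchesAt(byte[] data, int start) {
        for (int i = 0; i < EOF_MARKER.length; i++) {
            if (data[start + i] != EOF_MARKER[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Since the end-of-line bytes after the marker may or may not have belonged to the original, a comparison would likely need to try the result as returned and also with "\n", "\r\n", or "\r" appended before concluding that the hashes differ.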
