我有一个PDF,我使用PDFBox从中提取了一个页面:
(...)
File input = new File("C:\\temp\\sample.pdf");
document = PDDocument.load(input);
List allPages = document.getDocumentCatalog().getAllPages();
PDPage page = (PDPage) allPages.get(2);
PDStream contents = page.getContents();
if (contents != null) {
System.out.println(contents.getInputStreamAsString());
(...)
Run Code Online (Sandbox Code Playgroud)
根据PDF规范,这给出了以下结果,看起来像您期望的结果.
q
/GS0 gs
/Fm0 Do
Q
/Span <</Lang (en-US)/MCID 88 >>BDC
BT
/CS0 cs 0 0 0 scn
/GS1 gs
/T1_0 1 Tf
8.5 0 0 8.5 70.8661 576 Tm
(This page has been intentionally left blank.)Tj
ET
EMC
1 1 1 scn
/GS0 gs
22.677 761.102 …Run Code Online (Sandbox Code Playgroud)