Java: convert MS Word and PDF to images (png/jpg)

Pet*_*ski 3 java image-processing

I am looking for a free library to convert MS Word, WordPerfect and PDF documents to images. Does anyone know of a good, up-to-date Java library?

Ati*_*dhe 5

To convert PDF to images you can use PDFBox.

Here is code that converts a PDF to images using the PDFBox API:

import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;

// Written against the PDFBox 1.x API. PDFBox 2.x removed PDPage.convertToImage
// in favour of org.apache.pdfbox.rendering.PDFRenderer.
public class PdfToImages {

    @SuppressWarnings("unchecked")
    public List<String> generateImages(String pdfFile) throws IOException {
        String imagePath = "/Users/$user/pdfimages/";
        List<String> fileNames = new ArrayList<String>();
        PDDocument document = PDDocument.load(pdfFile); // load the PDF
        try {
            // getAllPages() flattens the page tree, so nested page nodes are covered too
            List<PDPage> pages = document.getDocumentCatalog().getAllPages();
            int count = 0;
            for (PDPage page : pages) {
                // render the page as an RGB image at 128 DPI
                BufferedImage img = page.convertToImage(BufferedImage.TYPE_INT_RGB, 128);
                File imageFile = new File(imagePath + count++ + ".jpg");
                ImageIO.write(img, "jpg", imageFile);
                fileNames.add(imageFile.getName());
            }
        } finally {
            document.close(); // release the underlying file
        }
        return fileNames;
    }
}

Another library, Apache POI, can be used to convert PowerPoint presentations (.ppt) to images.

Here is a code sample:

import java.awt.Color;
import java.awt.Dimension;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.poi.hslf.model.Slide;
import org.apache.poi.hslf.usermodel.SlideShow;

public class JavaApplication12 {

    public static void main(String[] args) throws FileNotFoundException, IOException {
        // read the presentation; the stream can be closed once it has been parsed
        FileInputStream is = new FileInputStream("D:/Presentation1.ppt");
        SlideShow ppt = new SlideShow(is);
        is.close();

        Dimension pgsize = ppt.getPageSize();

        Slide[] slide = ppt.getSlides();
        for (int i = 0; i < slide.length; i++) {

            BufferedImage img = new BufferedImage(pgsize.width, pgsize.height,
                    BufferedImage.TYPE_INT_RGB);

            Graphics2D graphics = img.createGraphics();
            graphics.setRenderingHint(RenderingHints.KEY_ANTIALIASING,
                    RenderingHints.VALUE_ANTIALIAS_ON);
            graphics.setRenderingHint(RenderingHints.KEY_RENDERING,
                    RenderingHints.VALUE_RENDER_QUALITY);
            graphics.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BICUBIC);
            graphics.setRenderingHint(RenderingHints.KEY_FRACTIONALMETRICS,
                    RenderingHints.VALUE_FRACTIONALMETRICS_ON);

            // paint a white background
            graphics.setColor(Color.white);
            graphics.clearRect(0, 0, pgsize.width, pgsize.height);
            graphics.fill(new Rectangle2D.Float(0, 0, pgsize.width, pgsize.height));

            // render
            slide[i].draw(graphics);
            graphics.dispose();

            // save the output
            FileOutputStream out = new FileOutputStream("slide-" + (i + 1) + ".png");
            javax.imageio.ImageIO.write(img, "png", out);
            out.close();
        }
    }
}
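If the slide images come out too small, one common approach is to render onto a larger canvas and scale the Graphics2D before drawing. The following is only a sketch using plain JDK classes: the class name ScaledRenderSketch, the render helper and the zoom parameter are made up for illustration, and the blue ellipse stands in for the slide[i].draw(graphics) call so the example runs without POI.

```java
import java.awt.Color;
import java.awt.Dimension;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.geom.Ellipse2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class ScaledRenderSketch {

    /** Creates a white canvas of pageSize * zoom pixels and draws on it in page coordinates. */
    static BufferedImage render(Dimension pageSize, double zoom) {
        int width = (int) Math.ceil(pageSize.width * zoom);
        int height = (int) Math.ceil(pageSize.height * zoom);
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = img.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_ANTIALIASING,
                    RenderingHints.VALUE_ANTIALIAS_ON);
            // fill the whole pixel canvas with white before scaling
            g.setPaint(Color.WHITE);
            g.fillRect(0, 0, width, height);
            // from here on, drawing code keeps using page units
            g.scale(zoom, zoom);
            // stand-in for slide[i].draw(g): an ellipse centred on the page
            g.setPaint(Color.BLUE);
            g.fill(new Ellipse2D.Double(pageSize.width / 4.0, pageSize.height / 4.0,
                    pageSize.width / 2.0, pageSize.height / 2.0));
        } finally {
            g.dispose();
        }
        return img;
    }

    public static void main(String[] args) throws IOException {
        // a 720x540 page at zoom 2.0 gives a 1440x1080 pixel image
        BufferedImage img = render(new Dimension(720, 540), 2.0);
        File out = File.createTempFile("slide-sketch-", ".png");
        ImageIO.write(img, "png", out);
        System.out.println("Wrote " + out);
    }
}
```

In the POI example the same idea would mean sizing the BufferedImage by the zoom factor and calling graphics.scale(zoom, zoom) before slide[i].draw(graphics).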

To convert MS Word to images, have a look at the question posted here, which uses JODConverter.

JODConverter automates all conversions supported by OpenOffice.org, including:

  • Any format to PDF
      o OpenDocument (Text, Spreadsheet, Presentation) to PDF
      o Word to PDF; Excel to PDF; PowerPoint to PDF
      o RTF to PDF; WordPerfect to PDF; ...
  • And more
      o OpenDocument Presentation (odp) to Flash; PowerPoint to Flash
      o RTF to OpenDocument; WordPerfect to OpenDocument
      o Any format to HTML (with limitations)
      o Support for OpenOffice.org 1.0 and old StarSuite formats
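For Word to image, the chain would then be Word to PDF first, and PDF to image with PDFBox as above. The sketch below does not use the JODConverter API. It only illustrates the kind of office conversion that gets automated, by calling the LibreOffice command line directly. It assumes a LibreOffice installation whose soffice binary is on the PATH; the class and method names are made up for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class OfficeToPdfSketch {

    /** Builds the command line for a headless LibreOffice conversion to PDF. */
    static List<String> buildCommand(String officeBinary, File input, File outputDir) {
        return Arrays.asList(
                officeBinary,
                "--headless",
                "--convert-to", "pdf",
                "--outdir", outputDir.getPath(),
                input.getPath());
    }

    /** LibreOffice names the result after the input file, with a .pdf extension. */
    static File expectedOutput(File input, File outputDir) {
        String name = input.getName();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        return new File(outputDir, base + ".pdf");
    }

    /** Runs the conversion and returns the file the PDF is expected to be written to. */
    static File convertToPdf(String officeBinary, File input, File outputDir)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(officeBinary, input, outputDir))
                .inheritIO()
                .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Office conversion failed with exit code " + exitCode);
        }
        return expectedOutput(input, outputDir);
    }

    public static void main(String[] args) throws Exception {
        // usage: java OfficeToPdfSketch <input document> <output directory>
        File pdf = convertToPdf("soffice", new File(args[0]), new File(args[1]));
        System.out.println("Expected PDF at " + pdf);
    }
}
```

The resulting PDF can then be passed to the PDFBox code to produce one image per page.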