How do I merge two Word documents saved as .docx into a third file?

Pal*_*eep 2 java apache

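I am trying to merge two documents, say document 1: Merger1.doc and document 2: Merger2.doc.

I want to store the result in a new file, doc2.docx.

I used the code below to do this, but it throws an error.

Code: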

import java.io.*;
import org.apache.poi.hwpf.HWPFDocument; 
import org.apache.poi.hwpf.usermodel.CharacterRun;
import org.apache.poi.hwpf.usermodel.Range;

public class MergerFiles {

public static void main (String[] args) throws Exception {  
    // POI apparently can't create a document from scratch,  
    // so we need an existing empty dummy document  
    HWPFDocument doc = new HWPFDocument(new FileInputStream("C:\\Users\\pallavi123\\Desktop\\Merger1.docx"));  
    Range range = doc.getRange();  


    //I can get the entire Document and insert it in the tmp.doc  
    //However any formatting in my word document is lost.  
    HWPFDocument doc2 = new HWPFDocument(new FileInputStream("C:\\Users\\pallavi123\\Desktop\\Merger2.docx"));  
    Range range2 = doc2.getRange();  
    range.insertAfter(range2.text());  

    //I can get the information (text only) for each character run/paragraph or section.  
    //Again any formatting in my word document is lost.  
    HWPFDocument doc3 = new HWPFDocument(new FileInputStream("D:\\doc2.docx"));  
    Range range3 = doc3.getRange();  
    for(int i=0;i<range3.numCharacterRuns();i++){  
        CharacterRun run3 = range3.getCharacterRun(i);  
        range.insertAfter(run3.text());  
    }  

    OutputStream out = new FileOutputStream("D:\\result.doc");  
    doc.write(out);  
    out.flush();  
    out.close();  
}  
}  

Error output:

Exception in thread "main" org.apache.poi.poifs.filesystem.OfficeXmlFileException: The supplied data appears to be in the Office 2007+ XML. You are calling the part of POI that deals with OLE2 Office Documents. You need to call a different part of POI to process this data (eg XSSF instead of HSSF)
at org.apache.poi.poifs.storage.HeaderBlockReader.<init>(HeaderBlockReader.java:108)
at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:151)
at org.apache.poi.hwpf.HWPFDocument.verifyAndBuildPOIFS(HWPFDocument.java:120)
at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:133)
at MergerFiles.main(MergerFiles.java:11)

Am I missing a jar file, or am I using the code the wrong way? Any advice would be appreciated.

Thanks in advance.

vic*_*107 5

I wrote the following class:

import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTBody;

public class WordMerge {

    private final OutputStream result;
    private final List<InputStream> inputs;
    private XWPFDocument first;

    public WordMerge(OutputStream result) {
        this.result = result;
        inputs = new ArrayList<>();
    }

    public void add(InputStream stream) throws Exception{            
        inputs.add(stream);
        OPCPackage srcPackage = OPCPackage.open(stream);
        XWPFDocument src1Document = new XWPFDocument(srcPackage);         
        if(inputs.size() == 1){
            first = src1Document;
        } else {            
            CTBody srcBody = src1Document.getDocument().getBody();
            first.getDocument().addNewBody().set(srcBody);            
        }        
    }

    public void doMerge() throws Exception{
        first.write(result);                
    }

    public void close() throws Exception{
        result.flush();
        result.close();
        for (InputStream input : inputs) {
            input.close();
        }
    }   
}

And its usage:

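// Requires java.io.FileInputStream and java.io.FileOutputStream
public static void main(String[] args) throws Exception {

    // Stream the merged document is written to
    FileOutputStream faos = new FileOutputStream("/home/victor/result.docx");

    WordMerge wm = new WordMerge(faos);

    // The first document added becomes the base; later ones are merged into it
    wm.add( new FileInputStream("/home/victor/001.docx") );
    wm.add( new FileInputStream("/home/victor/002.docx") );

    // Write the result, then close the output and all input streams
    wm.doMerge();
    wm.close();

}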
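
Some background on the exception in the question: HWPFDocument reads the legacy OLE2 binary format (.doc), while a .docx file is a ZIP-based Office Open XML package, which is why the class above uses XWPFDocument instead. If you are not sure what a given file really is (the extension can be misleading), you can check its first bytes before picking an API. Here is a minimal sketch that uses only the JDK. The class name WordFormatSniffer and its Format enum are made up for illustration and are not part of POI. Note that the ZIP check only shows the file is some ZIP archive, not that it is a valid .docx.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper (not part of Apache POI): guesses the container
// format of an Office file from its leading magic bytes.
class WordFormatSniffer {

    enum Format { OLE2, ZIP_BASED, UNKNOWN }

    // First 8 bytes of an OLE2 compound file (legacy .doc, .xls, .ppt).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // First 4 bytes of a ZIP local file header ("PK" 0x03 0x04).
    // A .docx is a ZIP package, but so is any other ZIP archive.
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    static Format detect(InputStream in) throws IOException {
        byte[] header = readHeader(in, OLE2_MAGIC.length);
        if (startsWith(header, OLE2_MAGIC)) {
            return Format.OLE2;       // would be handled by HWPFDocument
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Format.ZIP_BASED;  // candidate for XWPFDocument
        }
        return Format.UNKNOWN;
    }

    // Reads up to n bytes; returns fewer if the stream ends early.
    private static byte[] readHeader(InputStream in, int n) throws IOException {
        byte[] buf = new byte[n];
        int total = 0;
        while (total < n) {
            int read = in.read(buf, total, n - total);
            if (read < 0) {
                break;
            }
            total += read;
        }
        return Arrays.copyOf(buf, total);
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Simulated start of a .docx file: a ZIP local file header.
        byte[] zipStart = { 0x50, 0x4B, 0x03, 0x04, 0x14, 0x00, 0x06, 0x00 };
        System.out.println(detect(new ByteArrayInputStream(zipStart))); // prints ZIP_BASED
    }
}
```

With real files you would pass a FileInputStream instead. Keep in mind that detect consumes the header bytes, so reopen the file (or use a stream that supports mark/reset) before handing it to POI.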