Apache POI - reading an Excel file modifies it

MJa*_*Jar 4 java apache-poi

Whenever I open an Excel file with Apache POI, the file gets modified, even though I am only reading it and not making any changes.

Take this test class as an example.

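import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.MessageDigest;

import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import org.junit.Assert;
import org.junit.Test;

public class ApachePoiTest {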

    @Test
    public void readingShouldNotModifyFile() throws Exception {
        final File testFile = new File("C:/work/src/test/resources/Book2.xlsx");
        final byte[] originalChecksum = calculateChecksum(testFile);
        Assert.assertTrue("Calculating checksum modified file",
            MessageDigest.isEqual(originalChecksum, calculateChecksum(testFile)));
        try (Workbook wb = WorkbookFactory.create(testFile)) {
            Assert.assertNotNull("Reading file with Apache POI", wb);
        }
        Assert.assertTrue("Reading file with Apache POI modified file",
            MessageDigest.isEqual(originalChecksum, calculateChecksum(testFile)));
    }

    @Test
    public void readingInputStreamShouldNotModifyFile() throws Exception {
        final File testFile = new File("C:/work/src/test/resources/Book2.xlsx");
        final byte[] originalChecksum = calculateChecksum(testFile);
        Assert.assertTrue("Calculating checksum modified file",
            MessageDigest.isEqual(originalChecksum, calculateChecksum(testFile)));
        try (InputStream is = new FileInputStream(testFile); Workbook wb = WorkbookFactory.create(is)) {
            Assert.assertNotNull("Reading file with Apache POI", wb);
        }
        Assert.assertTrue("Reading file with Apache POI modified file",
            MessageDigest.isEqual(originalChecksum, calculateChecksum(testFile)));
    }

    private byte[] calculateChecksum(final File file) throws Exception {
        final MessageDigest md = MessageDigest.getInstance("MD5");
        md.reset();
        try (InputStream is = new FileInputStream(file)) {
            final byte[] bytes = new byte[2048];
            int numBytes;
            while ((numBytes = is.read(bytes)) != -1) {
                md.update(bytes, 0, numBytes);
            }
            return md.digest();
        }
    }
}

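The test readingShouldNotModifyFile always fails, because Apache POI always modifies the file. When I test with a blank Excel file freshly created in MS Office, Apache POI shrinks the file from 8.1 KB to 6.2 KB and corrupts it.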

Tested with:

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>3.15</version>
</dependency>

and also with version 3.12.

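I can stop Apache POI from modifying my file by passing an InputStream instead of a File. I would rather not pass an InputStream, though, because Apache warns that it needs more memory and places some specific requirements on the InputStream.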

Gag*_*arr 7

Your problem is that you are not passing the readonly flag, so Apache POI defaults to opening the file read/write.

You need to use the overloaded WorkbookFactory.create method that takes a readonly flag, and set that flag to true.

Change the line

try (InputStream is = new FileInputStream(testFile); Workbook wb = WorkbookFactory.create(is)) {

to

try (Workbook wb = WorkbookFactory.create(testFile, null, true)) {

and your file will be opened read-only, with no changes made to it.
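To check that a given way of opening the workbook really leaves the file untouched, the before/after checksum comparison from the question can be pulled into a small reusable helper. What follows is only a sketch using the JDK alone: `FileChangeCheck`, `FileAction`, `digest` and `unchangedBy` are hypothetical names made up for illustration and are not part of Apache POI, and SHA-256 is used here in place of the MD5 from the question purely as a matter of preference.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper (not part of Apache POI): detects whether an action altered a file. */
final class FileChangeCheck {

    /** An action on a file that may throw, e.g. opening and closing a workbook. */
    @FunctionalInterface
    interface FileAction {
        void run(Path file) throws Exception;
    }

    private FileChangeCheck() {}

    /** Returns the SHA-256 digest of the file's current contents. */
    static byte[] digest(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                md.update(buffer, 0, n);
            }
        }
        return md.digest();
    }

    /** Runs the action, then returns true if the file's bytes are identical to before. */
    static boolean unchangedBy(Path file, FileAction action) throws Exception {
        byte[] before = digest(file);
        action.run(file);
        return MessageDigest.isEqual(before, digest(file));
    }
}
```

With Apache POI on the classpath, a test could then assert something like `FileChangeCheck.unchangedBy(testFile.toPath(), p -> WorkbookFactory.create(p.toFile(), null, true).close())`. Whether that returns true for a particular POI version is exactly what such a test would tell you.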