Invalid header when reading xls file

dut*_*n79 8 java excel xls apache-poi

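I am reading an Excel file on my local system. I am using the POI jar version 3.7, but I get the error: Invalid header signature; read -2300849302551019537, or in hex 0xE011BDBFEFBDBFEF, expected -2226271756974174256, or in hex 0xE11AB1A1E011CFD0.

Opening the xls file with Excel works fine.

The code block where it happens is shown here. Does anyone have an idea?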

/**
 * create a new HeaderBlockReader from an InputStream
 *
 * @param stream the source InputStream
 *
 * @exception IOException on errors or bad data
 */
public HeaderBlockReader(InputStream stream) throws IOException {
    // At this point, we don't know how big our
    //  block sizes are
    // So, read the first 32 bytes to check, then
    //  read the rest of the block
    byte[] blockStart = new byte[32];
    int bsCount = IOUtils.readFully(stream, blockStart);
    if(bsCount != 32) {
        throw alertShortRead(bsCount, 32);
    }

    // verify signature
    long signature = LittleEndian.getLong(blockStart, _signature_offset);

    if (signature != _signature) {
        // Is it one of the usual suspects?
        byte[] OOXML_FILE_HEADER = POIFSConstants.OOXML_FILE_HEADER;
        if(blockStart[0] == OOXML_FILE_HEADER[0] &&
            blockStart[1] == OOXML_FILE_HEADER[1] &&
            blockStart[2] == OOXML_FILE_HEADER[2] &&
            blockStart[3] == OOXML_FILE_HEADER[3]) {
            throw new OfficeXmlFileException("The supplied data appears to be in the Office 2007+ XML. You are calling the part of POI that deals with OLE2 Office Documents. You need to call a different part of POI to process this data (eg XSSF instead of HSSF)");
        }
        if ((signature & 0xFF8FFFFFFFFFFFFFL) == 0x0010000200040009L) {
            // BIFF2 raw stream starts with BOF (sid=0x0009, size=0x0004, data=0x00t0)
            throw new IllegalArgumentException("The supplied data appears to be in BIFF2 format.  "
                    + "POI only supports BIFF8 format");
        }

        // Give a generic error
        throw new IOException("Invalid header signature; read "
                              + longToHex(signature) + ", expected "
                              + longToHex(_signature));
    }

Fel*_*lix 17

Just an idea: if you use Maven, make sure filtering is set to false in the resource tag. Otherwise Maven will corrupt the xls file during the copy phase.
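To see why a text-oriented copy step could produce exactly the value in the question, here is a minimal sketch. It assumes the copy decodes the bytes as UTF-8 and encodes them again; that is an assumption about how such corruption arises, not a description of Maven's internals. The class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class HeaderCorruptionDemo {
    // The 8-byte signature that a genuine OLE2 (.xls) file starts with.
    static final byte[] OLE2_SIGNATURE = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Simulates a text-mode copy (assumption): decode as UTF-8, then re-encode.
    // Byte sequences that are not valid UTF-8 become U+FFFD, i.e. EF BF BD.
    static byte[] copyAsText(byte[] binary) {
        String asText = new String(binary, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    // Reads the first 8 bytes as a little-endian long and formats it as hex.
    static String signatureHex(byte[] data) {
        long value = ByteBuffer.wrap(data, 0, 8)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getLong();
        return "0x" + Long.toHexString(value).toUpperCase();
    }

    public static void main(String[] args) {
        System.out.println(signatureHex(OLE2_SIGNATURE));
        System.out.println(signatureHex(copyAsText(OLE2_SIGNATURE)));
    }
}
```

The first line printed is 0xE11AB1A1E011CFD0 (the expected signature) and the second is 0xE011BDBFEFBDBFEF, the value reported in the question. The leading bytes 0xD0 and 0xCF are not valid UTF-8 on their own, so each is replaced by EF BF BD, which is consistent with the file having been treated as text somewhere along the way.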


Gag*_*arr 13

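The exception is telling you that your file is not a valid OLE2-based .xls file.

Being able to open the file in Excel is no real guide: Excel will happily open any file it knows about, whatever the extension. If you take a .csv file and rename it to .xls, Excel will still open it, but the rename has not magically turned it into the .xls format, so POI will not open it for you.

If you open the file in Excel and do a Save As, it will let you write it out as a real Excel file. If you want to know what kind of file it really is, try Apache Tika: the Tika CLI with --detect should be able to tell you.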
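If Tika is not at hand, a rough check of the leading bytes already tells a lot. The sketch below is not Tika and makes no attempt at real format detection; it only compares the start of the file with the OLE2 signature and with the ZIP local-file header (PK\x03\x04) that .xlsx files begin with. The class name, method names and result strings are made up for illustration, and readNBytes requires Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

public class FileKindSniffer {
    static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // ZIP local-file header "PK\x03\x04"; OOXML files such as .xlsx are ZIP archives.
    static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    // True if data begins with the given prefix.
    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    // Very rough classification based only on the leading bytes.
    static String guessKind(byte[] head) {
        if (startsWith(head, OLE2_MAGIC)) {
            return "OLE2 (binary .xls, the format HSSF reads)";
        }
        if (startsWith(head, ZIP_MAGIC)) {
            return "ZIP (possibly .xlsx, the format XSSF reads)";
        }
        return "neither OLE2 nor ZIP (perhaps CSV, HTML, or a damaged file)";
    }

    public static void main(String[] args) throws IOException {
        Path path = Paths.get(args[0]);
        byte[] head = new byte[8];
        int read;
        try (InputStream in = Files.newInputStream(path)) {
            read = in.readNBytes(head, 0, head.length);
        }
        System.out.println(guessKind(Arrays.copyOf(head, read)));
    }
}
```

Run it with the path of the suspect file as the only argument. A renamed .csv, or an .xls damaged by a text-mode copy, ends up in the last branch.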

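How can I be sure it is not a valid file? If you look at Microsoft's OLE2 file format specification document and go to section 2.2, you will see the following:

Header Signature (8 bytes): Identification signature for the compound file structure, and MUST be set to the value 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1.

Flip those bytes around (OLE2 is little-endian) and you get 0xE11AB1A1E011CFD0, which is the magic number in the exception. Your file does not start with that magic number because it is not actually a valid OLE2 document, so POI gives you that exception.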
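The byte flip can be checked by hand. This is a small sketch with made-up names; it does not use POI's own LittleEndian helper, it just combines the eight bytes with the last byte as the most significant one.

```java
public class SignatureFlip {
    // The signature bytes in file order, as listed in the specification.
    static final byte[] SPEC_BYTES = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Interprets the first 8 bytes as a little-endian long:
    // the byte at index 7 ends up as the most significant byte.
    static long littleEndianLong(byte[] b) {
        long value = 0;
        for (int i = 7; i >= 0; i--) {
            value = (value << 8) | (b[i] & 0xFFL);
        }
        return value;
    }

    public static void main(String[] args) {
        long signature = littleEndianLong(SPEC_BYTES);
        System.out.println("0x" + Long.toHexString(signature).toUpperCase());
        System.out.println(signature == 0xE11AB1A1E011CFD0L);
    }
}
```

This prints 0xE11AB1A1E011CFD0 followed by true. Reading the first 8 bytes of the problem file the same way and comparing them with this value shows directly whether the file is an OLE2 document.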