How to get the magic number from a file in Java

use*_*506 3 java file-extension magic-numbers

I have a file from an UploadedFile button, and I want to print the file's extension using its magic number.

My code:

UploadedFile file = (UploadedFile)valueChangeEvent.getNewValue();
byte[] fileByteArray = IOUtils.toByteArray(file.getInputStream());

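Note: the MIME type and the content type (derived from the file and from the file name) are not the same as the magic number (the magic number comes from the first bytes of the input stream).

How can I do this?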

Lor*_*nzo 12

I'm offering my solution in case people want an alternative without java-servlet-related code:

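import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public enum MagicBytes {
    PNG(0x89, 0x50),  // Same two-byte signatures as the other answer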
    JPG(0xFF, 0xD8),
    PDF(0x25, 0x50);
    
    private final int[] magicBytes;
    
    private MagicBytes(int...bytes) {
        magicBytes = bytes;
    }
    
    // Checks if bytes match a specific magic bytes sequence
    public boolean is(byte[] bytes) {
        if (bytes.length != magicBytes.length)
            throw new RuntimeException("I need the first "+magicBytes.length
                    + " bytes of an input stream.");
        for (int i=0; i<bytes.length; i++)
            if (Byte.toUnsignedInt(bytes[i]) != magicBytes[i])
                return false;
        return true;
    }
    
    // Extracts head bytes from any stream
    public static byte[] extract(InputStream is, int length) throws IOException {
        try (is) {  // automatically close stream on return
            byte[] buffer = new byte[length];
            is.read(buffer, 0, length);
            return buffer;
        }
    }
    
    /* Convenience methods */
    public boolean is(File file) throws IOException {
        return is(new FileInputStream(file));
    }
    
    public boolean is(InputStream is) throws IOException {
        return is(extract(is, magicBytes.length));
    }
}

Then call it like this, depending on whether you have a File or an InputStream:

MagicBytes.PNG.is(new File("picture.png"))
MagicBytes.PNG.is(new FileInputStream("picture.png"))

Being an enum also lets us loop over every format with MagicBytes.values() when needed.
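
A minimal sketch of such a loop, for illustration only. It assumes the head bytes are already in memory, and the names MagicBytesDemo, matches and detect are hypothetical, not part of the enum above. The signatures are the same two-byte ones used in this answer.

```java
import java.util.Optional;

public class MagicBytesDemo {

    // Two-byte signatures copied from the enum in this answer.
    public enum MagicBytes {
        PNG(0x89, 0x50),
        JPG(0xFF, 0xD8),
        PDF(0x25, 0x50);

        private final int[] magicBytes;

        MagicBytes(int... bytes) {
            magicBytes = bytes;
        }

        // True if 'head' starts with this format's signature.
        public boolean matches(byte[] head) {
            if (head.length < magicBytes.length)
                return false;
            for (int i = 0; i < magicBytes.length; i++)
                if (Byte.toUnsignedInt(head[i]) != magicBytes[i])
                    return false;
            return true;
        }
    }

    // Hypothetical helper: returns the first format whose signature matches.
    public static Optional<MagicBytes> detect(byte[] head) {
        for (MagicBytes format : MagicBytes.values())
            if (format.matches(head))
                return Optional.of(format);
        return Optional.empty();
    }

    public static void main(String[] args) {
        // First four bytes of a typical JPEG file.
        byte[] head = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0};
        System.out.println(detect(head).map(Enum::name).orElse("undefined")); // prints JPG
    }
}
```

Returning an Optional rather than null makes the "no known format" case explicit for the caller.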

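Edit: The code I posted earlier is a simplified version of the actual enum I use in my own library, adapted to the previous answer to help people understand it faster. However, some file formats can have several different headers, so if that is a problem for your particular use case, this class would be more suitable: gist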
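
To illustrate the multiple-header case, here is a rough sketch. It is not the code from the linked gist; the names MultiSignatureDemo, Format and matches are made up for this example. GIF ("GIF87a" / "GIF89a") and TIFF (little-endian "II*\0" / big-endian "MM\0*") are used because each has two well-known signatures.

```java
import java.nio.charset.StandardCharsets;

public class MultiSignatureDemo {

    // Each format lists every signature it may start with.
    public enum Format {
        GIF(new int[][] {
            {0x47, 0x49, 0x46, 0x38, 0x37, 0x61},    // "GIF87a"
            {0x47, 0x49, 0x46, 0x38, 0x39, 0x61}}),  // "GIF89a"
        TIFF(new int[][] {
            {0x49, 0x49, 0x2A, 0x00},    // little-endian
            {0x4D, 0x4D, 0x00, 0x2A}});  // big-endian

        private final int[][] signatures;

        Format(int[][] signatures) {
            this.signatures = signatures;
        }

        // True if 'head' starts with any of this format's signatures.
        public boolean matches(byte[] head) {
            for (int[] signature : signatures)
                if (startsWith(head, signature))
                    return true;
            return false;
        }

        private static boolean startsWith(byte[] head, int[] signature) {
            if (head.length < signature.length)
                return false;
            for (int i = 0; i < signature.length; i++)
                if (Byte.toUnsignedInt(head[i]) != signature[i])
                    return false;
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] head = "GIF89a".getBytes(StandardCharsets.US_ASCII);
        System.out.println(Format.GIF.matches(head));   // prints true
        System.out.println(Format.TIFF.matches(head));  // prints false
    }
}
```

Because signatures can differ in length, a caller would read as many bytes as the longest signature and let each format compare only its own prefix.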


Dew*_* MN 7

I know this is an old question; I'm just putting my answer here in the hope that someone searching for the same solution finds it useful.

import java.io.File;
import java.io.IOException;
import java.io.InputStream;

import javax.servlet.ServletContext;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.Part;

import javax.servlet.annotation.MultipartConfig;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

@MultipartConfig(
    fileSizeThreshold = 0,
    maxFileSize = 1024 * 1024 * 50,       // 50MB
    maxRequestSize = 1024 * 1024 * 100)   // 100MB
public class FileUpload extends HttpServlet {    

    private static final Logger logger = LogManager.getLogger(FileUpload.class);
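
    @Override
    public void doPost(HttpServletRequest request, HttpServletResponse response)
        throws IOException, ServletException {

        response.setContentType("text/plain");
        response.setCharacterEncoding("UTF-8");

        // Local buffer: one servlet instance serves many requests concurrently,
        // so per-request state must not live in a field.
        byte[] data = new byte[4];
        try {
            Part part = request.getPart("image_file");
            if (part != null) {  // getPart returns null when the part is missing
                data = fileSignature(part.getInputStream());
            }
        } catch (IOException ex) {
            logger.error(ex);
        }

        String fileType = getFileType(data);

        // return the recognized type
        response.getWriter().write(fileType);
    }

    /**
     * Get the first 4 bytes of a file (its signature).
     *
     * @param is Input stream of the uploaded part; closed before returning.
     * @return The first 4 bytes; positions stay 0 if the stream is shorter.
     */
     private byte[] fileSignature(InputStream is) throws IOException {
         byte[] signature = new byte[4];
         try (InputStream in = is) {
             int off = 0;
             int n;
             // A single read() may return fewer bytes than requested, so loop.
             while (off < signature.length
                     && (n = in.read(signature, off, signature.length - off)) != -1) {
                 off += n;
             }
         }
         return signature;
     }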

     /**
      * Get the file type based on the file signature.
      * Only jpeg/jpg, png and pdf are recognized here; jpg and jpeg
      * files share the same signature.
      *
      * @param fileData Byte array of the file.
      * @return String of the file type.
      */
     private String getFileType(byte[] fileData) {
         String type = "undefined";
         if(Byte.toUnsignedInt(fileData[0]) == 0x89 && Byte.toUnsignedInt(fileData[1]) == 0x50)
             type = "png";
         else if(Byte.toUnsignedInt(fileData[0]) == 0xFF && Byte.toUnsignedInt(fileData[1]) == 0xD8)
             type = "jpg";
         else if(Byte.toUnsignedInt(fileData[0]) == 0x25 && Byte.toUnsignedInt(fileData[1]) == 0x50)
             type = "pdf";

        return type;
    }
}

File magic number reference: