Encoding a file to Base64 in Java fails

Question

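I have this class to encode and decode a file. When I run it with a .txt file, the result is correct. But when I run it with a .jpg or .doc, I can't open the output file, or it is not equal to the original. I don't know why this happens. I adapted this class from http://myjeeva.com/convert-image-to-string-and-string-to-image-in-java.html. But I want to change this line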

byte imageData[] = new byte[(int) file.length()];

to

byte example[] = new byte[1024];

and read the file in as many passes as needed. Thanks.

import java.io.*;
import java.util.*;

  public class Encode {

input = path of the input file, output = path of the output file, imageDataString = the encoded string

  String input;
  String output;
  String imageDataString;


  public void setFileInput(String input){
    this.input=input;
  }

  public void setFileOutput(String output){
    this.output=output;
  }

  public String getFileInput(){
    return input;
  }

  public String getFileOutput(){
    return output;
  }

  public String getEncodeString(){
    return  imageDataString;
  }

  public String processCode(){
    StringBuilder sb= new StringBuilder();

    try{
        File fileInput= new File( getFileInput() );
        FileInputStream imageInFile = new FileInputStream(fileInput);

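In the examples I have seen, people create a byte[] with the same length as the file. I don't want that, because I don't know the length of the file in advance.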

        byte buff[] = new byte[1024];

        int r = 0;

        while ( ( r = imageInFile.read( buff)) > 0 ) {

          String imageData = encodeImage(buff);

          sb.append( imageData);

          if ( imageInFile.available() <= 0 ) {
            break;
          }
        }



    } catch (FileNotFoundException e) {
        System.out.println("File not found" + e);
    } catch (IOException ioe) {
        System.out.println("Exception while reading the file " + ioe);
    }

    imageDataString = sb.toString();

    return imageDataString;
  }


  public  void processDecode(String str) throws IOException{

      byte[] imageByteArray = decodeImage(str);
      File fileOutput= new File( getFileOutput());
      FileOutputStream imageOutFile = new FileOutputStream( fileOutput);

      imageOutFile.write(imageByteArray);
      imageOutFile.close();

}

 public static String encodeImage(byte[] imageByteArray) {

      return  Base64.getEncoder().withoutPadding().encodeToString( imageByteArray);

    }

    public static byte[] decodeImage(String imageDataString) {
      return  Base64.getDecoder().decode(  imageDataString);  

    }


  public static void main(String[] args) throws IOException {

    Encode a = new Encode();

    a.setFileInput( "C://Users//xxx//Desktop//original.doc");
    a.setFileOutput("C://Users//xxx//Desktop//original-copied.doc");

    a.processCode( );

    a.processDecode( a.getEncodeString());

    System.out.println("C O P I E D");
  }
}

I tried changing

String imageData = encodeImage(buff);

to

String imageData = encodeImage(buff,r);

and the encodeImage method to

public static String encodeImage(byte[] imageByteArray, int r) {

     byte[] aux = new byte[r];

     for ( int i = 0; i < aux.length; i++) {
       aux[i] = imageByteArray[i];

       if ( aux[i] <= 0 ) {
         break;
       }
     }
return  Base64.getDecoder().decode(  aux);
}

But I get this error:

Exception in thread "main" java.lang.IllegalArgumentException: Last unit does not have enough valid bits

Answer 1

Rea*_*tic 7

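Your program has two problems.

The first, as @Joop Eggen mentioned, is that you are not handling your input correctly.

Java does not guarantee that a read fills the whole 1024 bytes, even in the middle of the file. It may read only 50 bytes, tell you it read 50 bytes, and then read another 50 bytes the next time.

Suppose you read 1024 bytes in the previous round, and in the current round you read only 50. Your byte array now contains 50 new bytes, and the rest are old bytes left over from the previous read!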
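A minimal sketch of that effect, using a ByteArrayInputStream as a stand-in for the file. This is an assumption for illustration only: that stream comes up short only on its last read, whereas a real stream may do so anywhere. The class name StaleBufferDemo and the method readTwice are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class StaleBufferDemo {

    /** Returns { whole buffer after the second read, only the bytes that read returned }. */
    static String[] readTwice() throws IOException {
        InputStream in = new ByteArrayInputStream(
                "ABCDEFGHij".getBytes(StandardCharsets.US_ASCII));
        byte[] buff = new byte[8];

        in.read(buff);          // first read fills all 8 bytes: ABCDEFGH
        int r = in.read(buff);  // second read returns only 2 bytes: ij

        // The buffer still holds 6 bytes from the first read.
        String wholeBuffer = new String(buff, StandardCharsets.US_ASCII);
        // Only the first r bytes belong to the second read.
        String validPart = new String(buff, 0, r, StandardCharsets.US_ASCII);
        return new String[] { wholeBuffer, validPart };
    }

    public static void main(String[] args) throws IOException {
        String[] result = readTwice();
        System.out.println("Whole buffer: " + result[0]); // ijCDEFGH
        System.out.println("Valid part:   " + result[1]); // ij
    }
}
```

Encoding the whole buffer here would encode CDEFGH a second time, which is what happens to the tail of the file in the question's loop.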

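Therefore, you always need to copy exactly the number of bytes that were read into a new array, and pass that array to your encoding function.

So, to fix this particular problem, you need to do something like this: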

 while ( ( r = imageInFile.read( buff)) > 0 ) {

      byte[] realBuff = Arrays.copyOf( buff, r );

      String imageData = encodeImage(realBuff);

      ...
 }

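However, this is not the only problem. Your real problem lies in the Base64 encoding itself.

What Base64 does is take your bytes, break them into 6-bit chunks, and treat each chunk as a number N between 0 and 63. It then takes the Nth character from its character table to represent that chunk.

But this means it cannot encode just one byte or two bytes cleanly. One byte contains 8 bits, which is one 6-bit chunk plus 2 leftover bits. Two bytes contain 16 bits, which is two 6-bit chunks plus 4 leftover bits.

To solve this, Base64 always encodes groups of 3 consecutive bytes (24 bits, which is exactly four 6-bit chunks). If the input length is not divisible by 3, it pads the final group with extra zero bits.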
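The chunking described above can be sketched by hand for one group of three bytes. This is an illustration only, not how java.util.Base64 is implemented internally. The class name SixBitChunks and the method encodeTriple are made up for this example; the alphabet string is the standard Base64 table.

```java
import java.util.Base64;

public class SixBitChunks {

    // Standard Base64 alphabet: index N maps to the Nth character.
    private static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    /** Encodes exactly three bytes into four Base64 characters by hand. */
    static String encodeTriple(byte b0, byte b1, byte b2) {
        // Pack the three bytes into one 24-bit integer.
        int bits = ((b0 & 0xFF) << 16) | ((b1 & 0xFF) << 8) | (b2 & 0xFF);
        StringBuilder sb = new StringBuilder(4);
        // Peel off four 6-bit chunks, most significant first.
        for (int shift = 18; shift >= 0; shift -= 6) {
            int n = (bits >> shift) & 0x3F; // a number between 0 and 63
            sb.append(ALPHABET.charAt(n));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] triple = { 0b01010101, (byte) 0b11110000, (byte) 0b10101010 };
        String byHand = encodeTriple(triple[0], triple[1], triple[2]);
        String byLibrary = Base64.getEncoder().encodeToString(triple);
        System.out.println(byHand + " " + byLibrary); // VfCq VfCq
    }
}
```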

Here is a small program that demonstrates the problem:

package testing;

import java.util.Base64;

public class SimpleTest {

    public static void main(String[] args) {

        // An array containing six bytes to encode and decode.
        byte[] fullArray = { 0b01010101, (byte) 0b11110000, (byte)0b10101010, 0b00001111, (byte)0b11001100, 0b00110011 };

        // The same array broken into three chunks of two bytes.

        byte[][] threeTwoByteArrays = {
            {       0b01010101, (byte) 0b11110000 },
            { (byte)0b10101010,        0b00001111 },
            { (byte)0b11001100,        0b00110011 }
        };
        Base64.Encoder encoder = Base64.getEncoder().withoutPadding();

        // Encode the full array

        String encodedFullArray = encoder.encodeToString(fullArray);

        // Encode the three chunks consecutively 

        StringBuilder encodedStringBuilder = new StringBuilder();
        for ( byte [] twoByteArray : threeTwoByteArrays ) {
            encodedStringBuilder.append(encoder.encodeToString(twoByteArray));
        }
        String encodedInChunks = encodedStringBuilder.toString();

        System.out.println("Encoded full array: " + encodedFullArray);
        System.out.println("Encoded in chunks of two bytes: " + encodedInChunks);

        // Now  decode the two resulting strings

        Base64.Decoder decoder = Base64.getDecoder();

        byte[] decodedFromFull = decoder.decode(encodedFullArray);   
        System.out.println("Byte array decoded from full: " + byteArrayBinaryString(decodedFromFull));

        byte[] decodedFromChunked = decoder.decode(encodedInChunks);
        System.out.println("Byte array decoded from chunks: " + byteArrayBinaryString(decodedFromChunked));
    }

    /**
     * Convert a byte array to a string representation in binary
     */
    public static String byteArrayBinaryString( byte[] bytes ) {
        StringBuilder sb = new StringBuilder();
        sb.append('[');
        for ( byte b : bytes ) {
            sb.append(Integer.toBinaryString(Byte.toUnsignedInt(b))).append(',');
        }
        if ( sb.length() > 1) {
            sb.setCharAt(sb.length() - 1, ']');
        } else {
            sb.append(']');
        }
        return sb.toString();
    }
}

So, imagine that my 6-byte array is your image file, and that your buffer reads 2 bytes at a time instead of 1024. This is the output of the encoding:

Encoded full array: VfCqD8wz
Encoded in chunks of two bytes: VfAqg8zDM

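As you can see, encoding the full array gave us 8 characters. Each group of three bytes is converted into four 6-bit chunks, which are in turn converted into 4 characters.

But encoding the three two-byte arrays gave you a string of 9 characters. It is a completely different string! Each group of two bytes was extended to three 6-bit chunks by padding it with zero bits. And since you asked for no padding, it produced only 3 characters per group, without the extra = that normally marks an input whose length is not divisible by 3.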
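For comparison, here is a small sketch of what the first two-byte chunk looks like with and without padding. The class name PaddingDemo and its two helper methods are made up for this example; the encoder calls are the standard java.util.Base64 ones.

```java
import java.util.Base64;

public class PaddingDemo {

    /** Encodes with the default encoder, which appends = padding. */
    static String padded(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    /** Encodes with padding switched off, as in the question. */
    static String unpadded(byte[] data) {
        return Base64.getEncoder().withoutPadding().encodeToString(data);
    }

    public static void main(String[] args) {
        // The first two-byte chunk from the demonstration program.
        byte[] twoBytes = { 0b01010101, (byte) 0b11110000 };
        System.out.println("Padded:   " + padded(twoBytes));   // VfA=
        System.out.println("Unpadded: " + unpadded(twoBytes)); // VfA
    }
}
```

Without the = marker, nothing in the concatenated string shows where one short group ends and the next begins.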

The output of the part of the program that decodes the correctly encoded 8-character string is fine:

Byte array decoded from full: [1010101,11110000,10101010,1111,11001100,110011]

But the result of trying to decode the incorrectly encoded 9-character string is:

Exception in thread "main" java.lang.IllegalArgumentException: Last unit does not have enough valid bits
    at java.util.Base64$Decoder.decode0(Base64.java:734)
    at java.util.Base64$Decoder.decode(Base64.java:526)
    at java.util.Base64$Decoder.decode(Base64.java:549)
    at testing.SimpleTest.main(SimpleTest.java:34)

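Not good! A well-formed Base64 string has a length that is a multiple of 4 when padded; without padding, the length modulo 4 can be 0, 2, or 3, but never 1. Ours has 9 characters.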

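Since the buffer size you chose is 1024, which is not a multiple of 3, this problem will occur. You need to encode a multiple of 3 bytes each time to produce a correct string. So in practice, you need a buffer of size 3072 or something similar.

But because of the first problem, be careful about what you pass to the encoder. A read can return fewer than 3072 bytes at any time, and if that count is not divisible by 3, the same problem occurs.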
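One way to handle both problems together is to keep reading until the buffer is completely full (or the stream ends) before encoding, so that only the very last chunk can have a length that is not a multiple of 3. The following is a sketch under stated assumptions, not the only way to do it: the class name ChunkedBase64 and the helpers fill and encode are made up for this example, the buffer size 3072 is taken from the suggestion above, and the default padding encoder is used instead of withoutPadding().

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.Base64;

public class ChunkedBase64 {

    /** Reads until the buffer is full or the stream ends; returns the byte count. */
    static int fill(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int r = in.read(buf, total, buf.length - total);
            if (r < 0) {
                break; // end of stream
            }
            total += r;
        }
        return total;
    }

    /** Encodes a stream in chunks whose size is a multiple of 3. */
    static String encode(InputStream in) throws IOException {
        Base64.Encoder encoder = Base64.getEncoder();
        StringBuilder sb = new StringBuilder();
        byte[] buf = new byte[3072]; // 3072 = 3 * 1024
        int n;
        while ((n = fill(in, buf)) > 0) {
            // Pass only the n valid bytes; n < 3072 only for the final chunk.
            sb.append(encoder.encodeToString(Arrays.copyOf(buf, n)));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for file contents: 10,000 arbitrary bytes.
        byte[] original = new byte[10_000];
        for (int i = 0; i < original.length; i++) {
            original[i] = (byte) (i * 31);
        }
        String encoded = encode(new ByteArrayInputStream(original));
        byte[] decoded = Base64.getDecoder().decode(encoded);
        System.out.println(Arrays.equals(original, decoded)); // true
    }
}
```

For a real file, the same encode method could be given a FileInputStream opened in a try-with-resources block. The standard library also offers Base64.getEncoder().wrap(OutputStream), which encodes bytes as they are written to the stream and may be simpler if the encoded text does not need to be held in a String.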
