Read a stream twice

War*_*zit 113 java inputstream

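How do you read the same input stream twice? Is it possible to copy it somehow?

I need to fetch an image from the web, save it locally, and then return the saved image. I just thought it would be faster to reuse the same stream rather than open a new stream for the downloaded content and read it again.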

Pau*_*ime 99

You can use org.apache.commons.io.IOUtils.copy to copy the contents of the InputStream into a byte array, and then read from the byte array repeatedly using a ByteArrayInputStream. For example:

ByteArrayOutputStream baos = new ByteArrayOutputStream();
org.apache.commons.io.IOUtils.copy(in, baos);
byte[] bytes = baos.toByteArray();

// either
while (needToReadAgain) {
    ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
    yourReadMethodHere(bais);
}

// or
ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
while (needToReadAgain) {
    bais.reset();
    yourReadMethodHere(bais);
}

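  • I know this comment is late, but in the first option, if you read the input stream into a byte array, doesn't that mean you load all the data into memory? That could be a big problem if you load something like a large file. (24 upvotes)
  • @Paul Grime: IOUtils.toByteArray also calls the copy method internally. (3 upvotes)
  • As @Ankit said, this solution does not work for me, because the input is read internally and cannot be reused. (3 upvotes)
  • IOUtils.toByteArray(InputStream) can be used to get the byte array in a single call. (3 upvotes)
  • @jaxkodex, yes, that's right. If you, as the developer, know more about the actual type of stream you are dealing with, you can write more appropriate custom behavior. The answer provided is a general abstraction. (2 upvotes)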
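
For the task in the question (save the download, then hand it back), the same buffer-once idea can be sketched with the standard library alone. This is only a sketch: it assumes Java 9+ for InputStream.readAllBytes() and content small enough to fit in memory, and the names BufferOnceSketch and saveAndReopen are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class BufferOnceSketch {
    // Hypothetical helper: drains the source once, writes the bytes to target,
    // and returns a fresh in-memory stream over the same bytes.
    static InputStream saveAndReopen(InputStream source, Path target) throws IOException {
        byte[] bytes = source.readAllBytes();   // Java 9+; holds the whole content in memory
        Files.write(target, bytes);             // first use: save locally
        return new ByteArrayInputStream(bytes); // second use: re-read without touching the source
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("download", ".bin");
        InputStream source = new ByteArrayInputStream(new byte[] {1, 2, 3});
        try (InputStream again = saveAndReopen(source, target)) {
            System.out.println(again.readAllBytes().length + " bytes re-read, "
                    + Files.size(target) + " bytes saved");
        }
        Files.delete(target);
    }
}
```

The source stream is consumed exactly once; every later reader gets its own ByteArrayInputStream over the shared byte array.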

Kev*_*ker 26

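Depending on where the InputStream comes from, you may not be able to reset it. You can check whether mark() and reset() are supported by calling markSupported().

If they are, you can call reset() on the InputStream to return to the marked position (the beginning, if you called mark() before reading anything). If not, you need to read the InputStream from the source again.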

  • @ayahuasca `InputStream` subclasses such as `BufferedInputStream` do support "mark" (7 upvotes)
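
The check described in this answer can be sketched as follows. It is a minimal illustration, not a definitive implementation: it assumes Java 11+ for readNBytes(int), and the names MarkResetSketch, ensureMarkSupported and readTwice are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

class MarkResetSketch {
    // Hypothetical helper: returns the stream itself if it supports mark/reset,
    // otherwise wraps it in a BufferedInputStream, which does.
    static InputStream ensureMarkSupported(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    // Reads up to n bytes, rewinds to the mark, and reads the same bytes again.
    static byte[][] readTwice(InputStream in, int n) throws IOException {
        InputStream markable = ensureMarkSupported(in);
        markable.mark(n);                       // the mark only needs to survive n bytes
        byte[] first = markable.readNBytes(n);  // Java 11+
        markable.reset();                       // back to the marked position
        byte[] second = markable.readNBytes(n);
        return new byte[][] {first, second};
    }

    public static void main(String[] args) throws IOException {
        byte[][] reads = readTwice(new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5}), 3);
        System.out.println(Arrays.toString(reads[0]) + " " + Arrays.toString(reads[1]));
    }
}
```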

小智 10

If your InputStream supports mark, you can mark() the stream and later reset() it. If your InputStream does not support mark, you can use the class java.io.BufferedInputStream and wrap your stream in a BufferedInputStream like this:

    InputStream bufferedInputStream = new BufferedInputStream(yourInputStream);
    bufferedInputStream.mark(some_value); // some_value: how many bytes may be read before the mark becomes invalid
    // read your bufferedInputStream
    bufferedInputStream.reset();
    // read it again

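  • A BufferedInputStream can only mark back as far as its buffer size, so if the source does not fit, you cannot go all the way back to the beginning. (2 upvotes)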
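
The limitation raised in the comment above can be observed with a small experiment. This sketch reflects the behaviour of OpenJDK's BufferedInputStream as I understand it; the documented contract only promises that the mark stays valid for readlimit bytes. The names MarkLimitSketch and canResetAfter are made up.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class MarkLimitSketch {
    // Returns true if reset() still succeeds after reading bytesToRead bytes past the mark.
    static boolean canResetAfter(int bufferSize, int readLimit, int bytesToRead) throws IOException {
        byte[] data = new byte[64];
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(data), bufferSize);
        in.mark(readLimit);
        for (int i = 0; i < bytesToRead; i++) {
            in.read(); // single-byte reads always go through the internal buffer
        }
        try {
            in.reset();
            return true;
        } catch (IOException markInvalidated) {
            return false; // the mark was dropped once the buffer had to be refilled
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("small read limit: " + canResetAfter(4, 2, 10));
        System.out.println("large read limit: " + canResetAfter(4, 16, 10));
    }
}
```

In practice this means the value passed to mark() has to cover everything you intend to re-read, which for a whole stream amounts to buffering it in memory.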

wal*_*ros 7

You can wrap the input stream in a PushbackInputStream. PushbackInputStream lets you unread ("write back") bytes that have already been read, so you can do this:

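import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

public class StreamTest {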
  public static void main(String[] args) throws IOException {
    byte[] bytes = new byte[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };

    InputStream originalStream = new ByteArrayInputStream(bytes);

    byte[] readBytes = getBytes(originalStream, 3);
    printBytes(readBytes); // prints: 1 2 3

    readBytes = getBytes(originalStream, 3);
    printBytes(readBytes); // prints: 4 5 6

    // now let's wrap it with PushBackInputStream

    originalStream = new ByteArrayInputStream(bytes);

    InputStream wrappedStream = new PushbackInputStream(originalStream, 10); // 10 means that at most 10 bytes can be "written back" to the stream

    readBytes = getBytes(wrappedStream, 3);
    printBytes(readBytes); // prints 1 2 3

    ((PushbackInputStream) wrappedStream).unread(readBytes, 0, readBytes.length);

    readBytes = getBytes(wrappedStream, 3);
    printBytes(readBytes); // prints 1 2 3


  }

  private static byte[] getBytes(InputStream is, int howManyBytes) throws IOException {
    System.out.print("Reading stream: ");

    byte[] buf = new byte[howManyBytes];

    int next = 0;
    for (int i = 0; i < howManyBytes; i++) {
      next = is.read();
      if (next >= 0) { // -1 signals end of stream; 0 is a valid byte value
        buf[i] = (byte) next;
      }
    }
    return buf;
  }

  private static void printBytes(byte[] buffer) {
    // getBytes has already printed the "Reading stream: " label on this line

    for (int i = 0; i < buffer.length; i++) {
      System.out.print(buffer[i] + " ");
    }
    System.out.println();
  }


}

Note that PushbackInputStream keeps an internal buffer of bytes, so it really does allocate a buffer in memory that holds the bytes "written back".
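
A small sketch of what that internal buffer means in practice: you cannot push back more bytes than the size passed to the constructor. The names PushbackLimitSketch and canUnread are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;

class PushbackLimitSketch {
    // Returns true if `count` bytes can be read and then pushed back into a
    // PushbackInputStream created with the given pushback buffer size.
    static boolean canUnread(int pushbackSize, int count) throws IOException {
        PushbackInputStream in = new PushbackInputStream(
                new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5, 6}), pushbackSize);
        byte[] buf = new byte[count];
        int n = in.read(buf, 0, count); // the in-memory source returns all requested bytes at once
        try {
            in.unread(buf, 0, n);
            return true;
        } catch (IOException pushbackBufferFull) {
            return false; // more bytes than the pushback buffer can hold
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("size 3, unread 3: " + canUnread(3, 3));
        System.out.println("size 2, unread 3: " + canUnread(2, 3));
    }
}
```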

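Knowing this approach, we can go further and combine it with FilterInputStream. FilterInputStream stores the original input stream as a delegate. This makes it possible to define a new class that "unreads" the data it has read automatically. The class is defined as follows: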

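import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

public class TryReadInputStream extends FilterInputStream {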
  private final int maxPushbackBufferSize;

  /**
  * Creates a <code>TryReadInputStream</code> that wraps the argument
  * <code>in</code> in a <code>PushbackInputStream</code> and assigns it
  * to the field <code>this.in</code> so as to remember it for later use.
  *
  * @param in the underlying input stream
  * @param maxPushbackBufferSize the size of the pushback buffer, which is also
  *           the maximum number of bytes a single <code>tryRead</code> call may request
  */
  public TryReadInputStream(InputStream in, int maxPushbackBufferSize) {
    super(new PushbackInputStream(in, maxPushbackBufferSize));
    this.maxPushbackBufferSize = maxPushbackBufferSize;
  }

  /**
   * Reads up to <code>length</code> bytes from the input stream into the given buffer. The bytes read remain
   * available in the stream.
   *
   * @param buffer the destination buffer into which the data is read
   * @param offset the start offset in the destination <code>buffer</code>
   * @param length how many bytes to read from the stream into <code>buffer</code>. Must not be greater than
   *        <code>maxPushbackBufferSize</code>, or an IOException will be thrown
   *
   * @return number of bytes read
   * @throws java.io.IOException in case length is greater than <code>maxPushbackBufferSize</code>
   */
  public int tryRead(byte[] buffer, int offset, int length) throws IOException {
    validateMaxLength(length);

    // NOTE: below reading byte by byte instead of "int bytesRead = is.read(firstBytes, 0, maxBytesOfResponseToLog);"
    // because read() guarantees to read a byte

    int bytesRead = 0;

    int nextByte = 0;

    for (int i = 0; (i < length) && (nextByte >= 0); i++) {
      nextByte = read();
      if (nextByte >= 0) {
        buffer[offset + bytesRead++] = (byte) nextByte;
      }
    }

    if (bytesRead > 0) {
      ((PushbackInputStream) in).unread(buffer, offset, bytesRead);
    }

    return bytesRead;

  }

  public byte[] tryRead(int maxBytesToRead) throws IOException {
    validateMaxLength(maxBytesToRead);

    ByteArrayOutputStream baos = new ByteArrayOutputStream(); // as ByteArrayOutputStream to dynamically allocate internal bytes array instead of allocating possibly large buffer (if maxBytesToRead is large)

    // NOTE: below reading byte by byte instead of "int bytesRead = is.read(firstBytes, 0, maxBytesOfResponseToLog);"
    // because read() guarantees to read a byte

    int nextByte = 0;

    for (int i = 0; (i < maxBytesToRead) && (nextByte >= 0); i++) {
      nextByte = read();
      if (nextByte >= 0) {
        baos.write((byte) nextByte);
      }
    }

    byte[] buffer = baos.toByteArray();

    if (buffer.length > 0) {
      ((PushbackInputStream) in).unread(buffer, 0, buffer.length);
    }

    return buffer;

  }

  private void validateMaxLength(int length) throws IOException {
    if (length > maxPushbackBufferSize) {
      throw new IOException(
        "Trying to read more bytes than maxBytesToRead. Max bytes: " + maxPushbackBufferSize + ". Trying to read: " +
        length);
    }
  }

}

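This class has two methods. One reads into an existing buffer (its definition is analogous to public int read(byte b[], int off, int len) of the InputStream class). The second returns a new buffer (which may be more efficient when the size of the buffer to read is not known in advance).

Now let's see our class in action: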

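import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// NOTE: getBytes and printBytes are the helpers from StreamTest above; copy them into this class to compile it
public class StreamTest2 {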
  public static void main(String[] args) throws IOException {
    byte[] bytes = new byte[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };

    InputStream originalStream = new ByteArrayInputStream(bytes);

    byte[] readBytes = getBytes(originalStream, 3);
    printBytes(readBytes); // prints: 1 2 3

    readBytes = getBytes(originalStream, 3);
    printBytes(readBytes); // prints: 4 5 6

    // now let's use our TryReadInputStream

    originalStream = new ByteArrayInputStream(bytes);

    InputStream wrappedStream = new TryReadInputStream(originalStream, 10);

    readBytes = ((TryReadInputStream) wrappedStream).tryRead(3); // NOTE: no manual call to "unread"(!) because TryReadInputStream handles this internally
    printBytes(readBytes); // prints 1 2 3

    readBytes = ((TryReadInputStream) wrappedStream).tryRead(3); 
    printBytes(readBytes); // prints 1 2 3

    readBytes = ((TryReadInputStream) wrappedStream).tryRead(3);
    printBytes(readBytes); // prints 1 2 3

    // we can also call normal read which will actually read the bytes without "writing them back"
    readBytes = getBytes(wrappedStream, 3);
    printBytes(readBytes); // prints 1 2 3

    readBytes = getBytes(wrappedStream, 3);
    printBytes(readBytes); // prints 4 5 6

    readBytes = ((TryReadInputStream) wrappedStream).tryRead(3); // now we can try read next bytes
    printBytes(readBytes); // prints 7 8 9

    readBytes = ((TryReadInputStream) wrappedStream).tryRead(3); 
    printBytes(readBytes); // prints 7 8 9


  }



}


zeu*_*gor 7

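To split one InputStream into two, without loading all the data into memory, and then process them independently:

  1. Create a couple of OutputStreams, specifically PipedOutputStreams.
  2. Connect each PipedOutputStream to a PipedInputStream; these PipedInputStreams are the InputStreams that get returned.
  3. Connect the source InputStream to the OutputStreams just created, so that everything read from the source InputStream is also written to both OutputStreams. There is no need to implement this yourself, because it is already done in TeeInputStream (commons.io).
  4. Read the whole source InputStream in a separate thread; the input data is implicitly transferred to the target InputStreams.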

    public static final List<InputStream> splitInputStream(InputStream input) 
        throws IOException 
    { 
        Objects.requireNonNull(input);      
    
        PipedOutputStream pipedOut01 = new PipedOutputStream();
        PipedOutputStream pipedOut02 = new PipedOutputStream();
    
        List<InputStream> inputStreamList = new ArrayList<>();
        inputStreamList.add(new PipedInputStream(pipedOut01));
        inputStreamList.add(new PipedInputStream(pipedOut02));
    
        TeeOutputStream tout = new TeeOutputStream(pipedOut01, pipedOut02);
    
        TeeInputStream tin = new TeeInputStream(input, tout, true);
    
        Executors.newSingleThreadExecutor().submit(tin::readAllBytes);  
    
        return Collections.unmodifiableList(inputStreamList);
    }
    

Remember to close the InputStreams once they have been consumed, and to shut down the thread that runs TeeInputStream.readAllBytes().
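
The four steps can also be sketched with the standard library only, which makes the threading explicit. This is an illustration under assumptions, not the answer's code: it replaces TeeInputStream with a hand-written copy loop, gives each branch its own consumer thread (a piped stream blocks the writer when its small buffer is full and nobody is reading), and uses the made-up names PipedSplitSketch and consumeTwice. The consumers call readAllBytes() only to keep the demo short; a real consumer would process the data incrementally.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PipedSplitSketch {
    // Copies the source into two pipes while two consumer threads read them.
    static String[] consumeTwice(InputStream source) throws Exception {
        PipedOutputStream out1 = new PipedOutputStream();
        PipedOutputStream out2 = new PipedOutputStream();
        PipedInputStream in1 = new PipedInputStream(out1);
        PipedInputStream in2 = new PipedInputStream(out2);

        Callable<String> reader1 = () -> new String(in1.readAllBytes(), StandardCharsets.UTF_8);
        Callable<String> reader2 = () -> new String(in2.readAllBytes(), StandardCharsets.UTF_8);

        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<String> result1 = pool.submit(reader1);
            Future<String> result2 = pool.submit(reader2);
            // Closing both pipes (Java 9+ try-with-resources on existing variables)
            // signals end of stream to the consumers.
            try (out1; out2) {
                byte[] chunk = new byte[512];
                int n;
                while ((n = source.read(chunk)) != -1) {
                    out1.write(chunk, 0, n); // blocks while pipe 1 is full
                    out2.write(chunk, 0, n); // blocks while pipe 2 is full
                }
            }
            return new String[] {result1.get(), result2.get()};
        } finally {
            pool.shutdown(); // stop the consumer threads once both branches are done
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "split me".getBytes(StandardCharsets.UTF_8);
        String[] copies = consumeTwice(new ByteArrayInputStream(data));
        System.out.println(copies[0] + " | " + copies[1]);
    }
}
```

Only one chunk plus the two pipe buffers are held in memory at any time, which is the point of the piped approach; the price is that both branches must be consumed concurrently.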

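In case you need to split it into more than two InputStreams, replace TeeOutputStream in the splitInputStream snippet above with a class of your own that wraps a List<OutputStream> and overrides the OutputStream methods: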

public final class TeeListOutputStream extends OutputStream {
    private final List<? extends OutputStream> branchList;

    public TeeListOutputStream(final List<? extends OutputStream> branchList) {
        Objects.requireNonNull(branchList);
        this.branchList = branchList;
    }

    @Override
    public synchronized void write(final int b) throws IOException {
        for (OutputStream branch : branchList) {
            branch.write(b);
        }
    }

    @Override
    public void flush() throws IOException {
        for (OutputStream branch : branchList) {
            branch.flush();
        }
    }

    @Override
    public void close() throws IOException {
        for (OutputStream branch : branchList) {
            branch.close();
        }
    }
}


ala*_*inm 5

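If you are working with an implementation of InputStream, you can check the result of InputStream#markSupported() for that implementation, which tells you whether you can use the mark()/reset() methods.

If you can, mark the stream before reading, and then call reset() to go back to the start.

If you cannot, you will have to open a stream again.

Another solution is to convert the InputStream to a byte array and then iterate over the array as many times as you need. You can find several solutions, with or without third-party libraries, in the post "Convert InputStream to byte array in Java". Be careful: if the content being read is too large, you may run into memory problems.

Finally, if what you need is to read an image, use: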

BufferedImage image = ImageIO.read(new URL("http://www.example.com/images/toto.jpg"));

Using ImageIO#read(java.net.URL) also allows you to use the cache.
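
If the image bytes are already in memory (the byte-array approach mentioned above), the same array can be used both to save the file and to decode the image. A rough sketch, with the assumptions that the bytes fit in memory and that ImageIO has a reader for the format; ImageBytesSketch, saveAndDecode and samplePng are hypothetical names, and samplePng merely stands in for downloaded data.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.imageio.ImageIO;

class ImageBytesSketch {
    // Saves the encoded image bytes to target, then decodes the same bytes.
    static BufferedImage saveAndDecode(byte[] encodedImage, Path target) throws IOException {
        Files.write(target, encodedImage);
        // ImageIO.read returns null if no registered reader understands the data
        return ImageIO.read(new ByteArrayInputStream(encodedImage));
    }

    // Stand-in for downloaded bytes: encodes a blank image as PNG.
    static byte[] samplePng(int width, int height) throws IOException {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ImageIO.write(image, "png", out);
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("image", ".png");
        BufferedImage decoded = saveAndDecode(samplePng(4, 3), target);
        System.out.println(decoded.getWidth() + "x" + decoded.getHeight());
        Files.delete(target);
    }
}
```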