Java 8 streams: complex stream processing

pio*_*rek 1 java java-8 java-stream

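I want to create a method that performs some complex operation on a stream (e.g. replace the 7th element, remove the last element, remove adjacent duplicates, etc.) without caching the whole stream.

But which stream API lets me plug in such a method? Do I have to create my own collector that, while collecting, sends items to some other stream? But that would change the direction of the data flow, from pull to push, right?

What would a possible signature for such a method be?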

Stream<T> process(Stream<T> in)

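is probably impossible (in single-threaded code), because the result can only be returned after the entire input stream has been collected.

Another idea: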

void process(Stream<T> in, Stream<T> out)

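also seems somewhat flawed, because Java does not allow emitting (inserting) items into an existing stream (the one supplied as the out parameter).

So how can I do some complex stream processing in Java?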

spr*_*ter 5

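The complex operations you use as examples all follow the same pattern: operating on one element of the stream depending on other elements of the stream. Java streams are specifically designed not to allow these types of operations without a collection or reduction. Stream operations do not allow direct access to other members and, in general, non-terminal operations with side effects are a bad idea.

Note the following from the Stream javadoc:

Collections and streams, while bearing some superficial similarities, have different goals. Collections are primarily concerned with the efficient management of, and access to, their elements. By contrast, streams do not provide a means to directly access or manipulate their elements, and are instead concerned with declaratively describing their source and the computational operations which will be performed in aggregate on that source.

More specifically:

Most stream operations accept parameters that describe user-specified behavior... To preserve correct behavior, these behavioral parameters:

must be non-interfering (they do not modify the stream source); and in most cases must be stateless (their result should not depend on any state that might change during execution of the stream pipeline).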

Stream pipeline results may be nondeterministic or incorrect if the behavioral parameters to the stream operations are stateful. A stateful lambda (or other object implementing the appropriate functional interface) is one whose result depends on any state which might change during the execution of the stream pipeline.

All the complexities of intermediate and terminal, stateless and stateful operations are well described at https://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html and http://docs.oracle.com/javase/8/docs/api/java/util/stream/Stream.html

This approach has both advantages and disadvantages. A significant advantage is that it allows parallel processing of streams. A significant disadvantage is that operations that are easy in some other languages (such as skipping every third element in the stream) are difficult in Java.

Note that you will see a lot of code (including accepted answers on SO) that ignores the advice that behavioural parameters of stream operations should be stateless. To work, this code relies on behaviour of an implementation of Java that is not defined by the language specification: namely, that streams are processed in order. There is nothing in the specification stopping an implementation of Java from processing elements in reverse order or random order. Such an implementation would make any stateful stream operations immediately behave differently. Stateless operations would continue to behave exactly the same. So, to summarise, stateful operations rely on details of the implementation of Java rather than the specification.
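To make the distinction concrete, here is a minimal sketch of "skip every third element" written both ways. The class and method names (`SkipEveryThird`, `statefulSkip`, `statelessSkip`) are made up for illustration. The stateless variant assumes the data is already in a random-access `List`, which sidesteps rather than solves the "without caching the whole stream" requirement.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper class, for illustration only.
final class SkipEveryThird {

    // Stateful: the predicate's result depends on how many times it has been
    // called so far, i.e. on the order in which elements happen to be processed.
    static <T> List<T> statefulSkip(List<T> items) {
        AtomicInteger counter = new AtomicInteger();
        return items.stream()
                .filter(x -> counter.incrementAndGet() % 3 != 0)
                .collect(Collectors.toList());
    }

    // Stateless: each decision depends only on the index being tested,
    // so processing order (or running in parallel) does not matter.
    static <T> List<T> statelessSkip(List<T> items) {
        return IntStream.range(0, items.size())
                .filter(i -> (i + 1) % 3 != 0)
                .mapToObj(items::get)
                .collect(Collectors.toList());
    }
}
```

For a list holding a to g, `statelessSkip` returns a, b, d, e, g. The stateful version only gives that result when the filter happens to be called on the elements in encounter order.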

Also note that it is possible to have safe stateful intermediate operations. They need to be designed so that they specifically do not rely on the order in which elements are processed. Stream.distinct and Stream.sorted are good examples of this. They need to maintain state to work, but they are designed to work irrespective of the order in which elements are processed.
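As a small illustration, the built-in stateful operations give the same result whether the pipeline runs sequentially or in parallel. The class and method names here are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class, for illustration only.
final class SafeStatefulOps {

    // distinct() and sorted() keep internal state, but their result does not
    // depend on the order in which the elements are actually processed.
    static List<Integer> distinctSorted(boolean parallel) {
        Stream<Integer> s = Stream.of(3, 1, 3, 2, 1);
        if (parallel) {
            s = s.parallel();
        }
        return s.distinct().sorted().collect(Collectors.toList());
    }
}
```

Both `distinctSorted(false)` and `distinctSorted(true)` yield the list 1, 2, 3.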

So to answer your question, these types of operations are possible to do in Java but they are not simple, safe (for the reason given in the previous paragraph) or a natural fit for the language design. I suggest using reduction or collection or (see Tagir Valeev's answer) a spliterator to create a new stream. Alternatively use traditional iteration.
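As a rough sketch of the spliterator route (not a reproduction of the answer referred to above), the following wraps the source's spliterator to drop adjacent duplicates lazily, using the `Stream<T> process(Stream<T> in)` signature from the question. The class name `AdjacentDedup` is made up, and the wrapper is assumed to be used sequentially only, since it keeps the previous element as state.

```java
import java.util.Objects;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical helper class, for illustration only.
final class AdjacentDedup {

    /** Lazily drops every element equal to its immediate predecessor. Sequential use only. */
    static <T> Stream<T> process(Stream<T> in) {
        Spliterator<T> source = in.spliterator();
        Spliterator<T> result = new Spliterators.AbstractSpliterator<T>(
                Long.MAX_VALUE, source.characteristics() & Spliterator.ORDERED) {
            private boolean hasPrevious = false;
            private T previous;

            @Override
            public boolean tryAdvance(Consumer<? super T> action) {
                boolean[] emitted = {false};
                // Pull from the source until one element is passed downstream
                // or the source is exhausted.
                while (!emitted[0] && source.tryAdvance(current -> {
                    if (!hasPrevious || !Objects.equals(previous, current)) {
                        hasPrevious = true;
                        previous = current;
                        action.accept(current);
                        emitted[0] = true;
                    }
                })) {
                    // intentionally empty: all work happens in the loop condition
                }
                return emitted[0];
            }
        };
        // 'false' requests a sequential stream; closing it also closes the input.
        return StreamSupport.stream(result, false).onClose(in::close);
    }
}
```

Because elements are pulled one at a time, this also works on an infinite input followed by `limit`. For example, `process(Stream.of(1, 1, 2, 2, 2, 3, 1, 1))` yields 1, 2, 3, 1. An operation such as "remove the last element" could be written the same way by holding back one element, at the cost of a one-element buffer.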