我目前正在用Java创建自己的持久性数组,该数组使用二进制搜索树存储值的集合。
我想添加一个以a Function
作为参数的map方法来生成一个新数组。除非请求特定值,否则我不希望评估这些功能。这是很简单的事情,因为lambdas是惰性评估的。但是,我也只希望函数只被评估一次,即使多次请求结果也是如此。
我可以创建一个存储供应商的节点,并在评估时更新结果:
class Node<T> {
private T value;
private Supplier<T> supplier;
public T get() {
if (null != value)
return value;
value = supplier.get();
return value;
}
}
Run Code Online (Sandbox Code Playgroud)
... where supplier
是从Function
应用于持久化数组旧版本中的值得出的。
但是,这不再是一种功能性方法,并且有可能在多线程系统中引起错误*。在供应商返回空值**的情况下,这也不会产生任何优势。
另一种方法是Node
在get调用中返回的实例:
class Node<T> {
private final Optional<T> value;
private final Supplier<T> supplier;
Node(Supplier<T> supplier, T value) {
this.supplier = supplier;
this.value = Optional.ofNullable(value);
}
public Tuple<Node<T>, T> get() {
if (null != value)
return new Tuple<>(this, value.orElse(null));
T result = supplier.get();
Node<T> newNode = new Node<>(null, result);
return new Tuple<>(newNode, result);
}
}
Run Code Online (Sandbox Code Playgroud)
我喜欢这种保持功能的方法;但是,仅通过get调用,在树上的所有父节点中都将需要大量开销。而且这将需要在使用应用程序的代码中进行繁琐的拆箱。
有谁能想到其他方法来按照我的要求进行这项工作?谢谢。
*这可以使用锁定机制解决,但会增加一层我希望避免的复杂性。
**我已经考虑过制作value
一个Optional<T>
,其中的null
均值尚未被评估,而Optional.empty()
as已被评估并返回null。但是,这只能解决我的问题,而不是解决问题。
对于任何不熟悉持久化数组的人来说,它都是一种数据结构,每当执行更新时,它都会创建其自身的新实例。这使其完全不可变。使用二叉树(或更常见的32位树)方法可以使更新减少速度和内存上的重复数据。
编辑:
集合的代码可以在github上找到。有关使用说明,请查看测试文件夹。
Disclaimer: this answer doesn't respond the question directly, as it doesn't use neither Supplier
nor Optional
directly in the Node
class. Instead, a generic functional programming technique is presented that might help solve the problem.
If the problem is all about evaluating the function only once for each input value, then you shouldn't change your tree/array/nodes. Instead, memoize the function, which is a pure functional approach:
In computing, memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again
Here's a way to do it, inspired by this excellent article written by Pierre-Yves Saumont (please check it for an in-depth introduction to memoization):
public static <T, U> Function<T, U> memoize(Function<T, U> function) {
Map<T, U> cache = new ConcurrentHashMap<>();
return input -> cache.computeIfAbsent(input, function::apply);
}
Run Code Online (Sandbox Code Playgroud)
Suppose you have a method that takes quite long to execute. Then, you could use the memoize
method this way:
// This method takes quite long to execute
Integer longCalculation(Integer x) {
try {
Thread.sleep(1_000);
} catch (InterruptedException ignored) {
}
return x * 2;
}
// Our function is a method reference to the method above
Function<Integer, Integer> function = this::longCalculation;
// Now we memoize the function declared above
Function<Integer, Integer> memoized = memoize(function);
Run Code Online (Sandbox Code Playgroud)
Now, if you call:
int result1 = function.apply(1);
int result2 = function.apply(2);
int result3 = function.apply(3);
int result4 = function.apply(2);
int result5 = function.apply(1);
Run Code Online (Sandbox Code Playgroud)
You'll notice that the five calls take ~5 seconds altogether (1 second for each call).
However, if you use the memoized
function with the same input values 1 2 3 2 1
:
int memoizedResult1 = memoized.apply(1);
int memoizedResult2 = memoized.apply(2);
int memoizedResult3 = memoized.apply(3);
int memoizedResult4 = memoized.apply(2); // <-- returned from cache
int memoizedResult5 = memoized.apply(1); // <-- returned from cache
Run Code Online (Sandbox Code Playgroud)
You'll notice that now the five calls take ~3 seconds altogether. This is because the last two results are immediately returned from the cache.
So, back to your structure... Inside your map
method, you could just memoize the given function and use the returned memoized function instead. Internally, this will cache the function's return values in the ConcurrentHashMap
.
As the memoize
method uses a ConcurrentHashMap
internally, you don't need to worry about concurrency.
注意:这只是开始......我正在考虑两个可能的改进。一种方法是限制缓存的大小,这样如果给定函数的域太大,它就不会占用整个内存。另一个改进是仅在先前未记忆给定函数时才记忆该函数。但这些细节留给读者作为练习......;)
归档时间: |
|
查看次数: |
528 次 |
最近记录: |