Cal*_*rey 52 java sorting parallel-processing
我正在阅读有关Java 和的区别的问题,这已经有几年了。令我惊讶的是,只有一个问题提到了使用的任何缺点; 也就是说,如果您使用大量CPU,则速度会降低。Arrays.sort
Arrays.parallelSort
parallelSort
假设您不在某种专门的单线程环境中,应该总是选择一个parallelSort
吗?有没有理由不这样做?请注意,上述问题的答案之一是,如果少于4096个元素,则无论如何都会parallelSort
调用sort
。
roo*_*eee 55
使用有一些缺点 Arrays.parallelSort
ForkJoinPool.commonPool()
和,并且会与默认情况下使用它的其他函数(例如parallel()
,在流上)抗争Arrays.parallelSort
使用是不可配置的(仅在全局级别上通过增加公共池线程数量)ArrayList
在整个生命周期内都为空,这可以节省大量的内存和CPU时间,无需实例化永远不会填充的数组)还有一个有趣的场景:说说您是否实施了一些需要分类的纸牌游戏。它非常容易并行地并行执行多个游戏执行,而不是并行执行一次运行的排序机制,而排序机制可能只占整个游戏循环的一小部分。您失去了立即并行化的简便方法(例如,在遗传算法的背景下运行游戏时)。
但是,是的,如果碰巧有大数组,排序是应用程序运行时使用的重要组成部分Arrays.parallelSort
。
编辑:即使Arrays.parallelSort
给定数组少于4096个元素,即使切换到常规排序:都是为了显示意图-如果可能,您需要一个并行排序,它具有与调用不同的含义sort
。而且要挑剔:它确实在小型阵列上表现较差,因为它必须进行额外检查,以检查该阵列是否包含少于4096个元素,以及其他一些有关公共池线程数的检查(当然,其开销可以忽略不计):) 。
This is not much different than the question when to use stream()
vs parallelStream()
- it depends on how much data you have. Of course the majority of time, when sorting 10 elements in parallel, will be consumed by the threading framework that is under the hood (which is un-specified by the documentation), not by the sorting itself.
But you also have to wonder why such methods are introduced IMO. Hardware is moving (has already moved?) towards many CPU's, not more GHz
, so doing things in parallel is only a normal course for any language that wants to still be alive in the next 20 years.
As to how much data you need to actually be performant for parallelSort
as opposed to sort
, plus knowing that we need at least MIN_ARRAY_SORT_GRAN + 1
to gain any potential benefit; writing a proper test to prove that for this particular set-up and run, you would need at least X
numbers, is not that complicated. You also have to take into account that some arrays might be already sorted (explained further), while some might be totally un-sorted (5,4,3,2,1
for example), this brings some penalties for the second one.
Taking some random data and making a test:
@Warmup(iterations = 10)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
public class ParallelSort {
public static void main(String[] args) throws Exception {
Options opt = new OptionsBuilder()
.include(ParallelSort.class.getName())
.build();
new Runner(opt).run();
}
@Benchmark
@BenchmarkMode(Mode.AverageTime)
@Fork(1)
public int[] parallel(ParallelSortExecutionPlan plan) {
Arrays.parallelSort(plan.ints());
return plan.ints();
}
@Benchmark
@BenchmarkMode(Mode.AverageTime)
@Fork(1)
public int[] nonParallel(ParallelSortExecutionPlan plan) {
Arrays.sort(plan.ints());
return plan.ints();
}
}
@State(Scope.Benchmark)
public class ParallelSortExecutionPlan {
@Param(value = {"10", "100", "1000", "10000", "100000", "1000000"})
private int howMany;
private int[] ints;
public static void main(String[] args) {
}
@Setup(Level.Invocation)
public void setUp() {
ints = new int[howMany];
for (int i = 0; i < howMany; ++i) {
ints[i] = ThreadLocalRandom.current().nextInt();
}
}
int[] ints() {
return ints;
}
}
Run Code Online (Sandbox Code Playgroud)
Just notice that the second class is using @Setup(Level.Invocation)
(if you know a bit of JMH) - this is a very sharp tool here; but I uses it because I want an un-sorted array for each Invocation
of the method. As otherwise, if Trial
would have been used for example - only the first call would be an un-sorted array, all other calls of the @Benhcmark
method would already be sorted. For the fun of it, you could change that single line to @Setup(Level.Trial)
for example and see the results, they will make very little sense.
Running this reveals:
Benchmark (howMany) Mode Cnt Score Error Units
ParallelSort.nonParallel 10 avgt 2 128.847 ns/op
ParallelSort.parallel 10 avgt 2 116.656 ns/op
ParallelSort.nonParallel 100 avgt 2 1956.746 ns/op
ParallelSort.parallel 100 avgt 2 1963.335 ns/op
ParallelSort.nonParallel 1000 avgt 2 32162.611 ns/op
ParallelSort.parallel 1000 avgt 2 31716.915 ns/op
ParallelSort.nonParallel 10000 avgt 2 423531.663 ns/op
ParallelSort.parallel 10000 avgt 2 201802.609 ns/op
ParallelSort.nonParallel 100000 avgt 2 6503511.987 ns/op
ParallelSort.parallel 100000 avgt 2 1363169.661 ns/op
ParallelSort.nonParallel 1000000 avgt 2 69058738.586 ns/op
ParallelSort.parallel 1000000 avgt 2 13469112.930 ns/op
Run Code Online (Sandbox Code Playgroud)
Pretty much a very expected output to me.
不,对于足够小的阵列我会说不。设置线程的开销不会导致明显的加速。
关键是“够小”。所有问题的答案都不会相同。
决不应该应用教条,除非是在这条教条规则的情况下。就像我们唯一不应该容忍的就是不容忍。那里有一个波普尔悖论。
归档时间: |
|
查看次数: |
3332 次 |
最近记录: |