LinkedList与ArrayList在维护有序列表时的性能

Mee*_*ary 14 java sorting linked-list arraylist

我想维持一个List<Integer>大小<= 10 ^ 6 的有序.每次添加一个新元素时,我都会调用Collections.sort()方法对列表中的新元素进行排序.据我所知,ArrayList表现要好于LinkedList.但是由于我会sort()经常调用方法,所以我已经明白,linkedList在对列表进行排序时会表现得更好并且将是更好的选择ArrayList,因为没有像ArrayList(array作为底层数据结构使用)的情况那样移动元素.任何更有效的建议.

ass*_*ias 17

您可以Collections#binarySearch在排序列表上使用以查找正确的插入点.ArrayList可能比LinkedList表现更好,特别是对于大小的大小,但这很容易测试.

我运行了各种方法的微基准测试:在每次插入后使用排序或使用binarySearch插入正确的位置,使用ArrayList(AL)和LinkedList(LL).我还添加了Commons TreeList和番石榴的TreeMultiset.

结论

  • 测试中最好的算法是使用TreeMultiset,但严格来说它不是一个列表 - 下一个最好的选择是使用ArrayList+ binarySearch
  • 在所有情况下,ArrayList的性能都优于LinkedList,后者需要几分钟才能完成100,000个元素(ArrayList只需不到一秒钟).

表现最佳的代码,供参考:

@Benchmark public ArrayList<Integer> binarySearchAL() {
  ArrayList<Integer> list = new ArrayList<> ();

  Random r = new Random();
  for (int i = 0; i < n; i++) {
    int num = r.nextInt();
    int index = Collections.binarySearch(list, num);
    if (index >= 0) list.add(index, num);
    else list.add(-index - 1, num);
    current = list.get(0); //O(1), to make sure the sort is not optimised away
  }
  return list;
}
Run Code Online (Sandbox Code Playgroud)

bitbucket上的完整代码.

完整的结果

"Benchmark"列包含被测方法的名称(baseLine只填充列表而不对其进行排序,其他方法具有显式名称:AL = ArrayList,LL = LinkedList,TL = Commons TreeList,treeMultiSet = guava),(n )是列表的大小,Score是以毫秒为单位的时间.

Benchmark                            (n)  Mode  Samples     Score     Error  Units
c.a.p.SO28164665.baseLine            100  avgt       10     0.002 ±   0.000  ms/op
c.a.p.SO28164665.baseLine           1000  avgt       10     0.017 ±   0.001  ms/op
c.a.p.SO28164665.baseLine           5000  avgt       10     0.086 ±   0.002  ms/op
c.a.p.SO28164665.baseLine          10000  avgt       10     0.175 ±   0.007  ms/op
c.a.p.SO28164665.binarySearchAL      100  avgt       10     0.014 ±   0.001  ms/op
c.a.p.SO28164665.binarySearchAL     1000  avgt       10     0.226 ±   0.006  ms/op
c.a.p.SO28164665.binarySearchAL     5000  avgt       10     2.413 ±   0.125  ms/op
c.a.p.SO28164665.binarySearchAL    10000  avgt       10     8.478 ±   0.523  ms/op
c.a.p.SO28164665.binarySearchLL      100  avgt       10     0.031 ±   0.000  ms/op
c.a.p.SO28164665.binarySearchLL     1000  avgt       10     3.876 ±   0.100  ms/op
c.a.p.SO28164665.binarySearchLL     5000  avgt       10   263.717 ±   6.852  ms/op
c.a.p.SO28164665.binarySearchLL    10000  avgt       10   843.436 ±  33.265  ms/op
c.a.p.SO28164665.sortAL              100  avgt       10     0.051 ±   0.002  ms/op
c.a.p.SO28164665.sortAL             1000  avgt       10     3.381 ±   0.189  ms/op
c.a.p.SO28164665.sortAL             5000  avgt       10   118.882 ±  22.030  ms/op
c.a.p.SO28164665.sortAL            10000  avgt       10   511.668 ± 171.453  ms/op
c.a.p.SO28164665.sortLL              100  avgt       10     0.082 ±   0.002  ms/op
c.a.p.SO28164665.sortLL             1000  avgt       10    13.045 ±   0.460  ms/op
c.a.p.SO28164665.sortLL             5000  avgt       10   642.593 ± 188.044  ms/op
c.a.p.SO28164665.sortLL            10000  avgt       10  1182.698 ± 159.468  ms/op
c.a.p.SO28164665.binarySearchTL      100  avgt       10    0.056 ±  0.002  ms/op
c.a.p.SO28164665.binarySearchTL     1000  avgt       10    1.083 ±  0.052  ms/op
c.a.p.SO28164665.binarySearchTL     5000  avgt       10    8.246 ±  0.329  ms/op
c.a.p.SO28164665.binarySearchTL    10000  avgt       10  735.192 ± 56.071  ms/op
c.a.p.SO28164665.treeMultiSet        100  avgt       10    0.021 ±  0.001  ms/op
c.a.p.SO28164665.treeMultiSet       1000  avgt       10    0.288 ±  0.008  ms/op
c.a.p.SO28164665.treeMultiSet       5000  avgt       10    1.809 ±  0.061  ms/op
c.a.p.SO28164665.treeMultiSet      10000  avgt       10    4.283 ±  0.214  ms/op
Run Code Online (Sandbox Code Playgroud)

对于100k物品:

c.a.p.SO28164665.binarySearchAL    100000  avgt        6  890.585 ± 68.730  ms/op
c.a.p.SO28164665.treeMultiSet      100000  avgt        6  105.273 ±  9.309  ms/op
Run Code Online (Sandbox Code Playgroud)

  • +1用于二进制搜索:).但是如果OP使用`ArrayList`,插入后移位需要时间.在这种情况下,`LinkedList`会更有效. (2认同)

sma*_*c89 6

由于java没有内置的multiset,这是适合您情况的完美数据结构,我建议使用guava库中的TreeMultiset.

Multisets允许重复元素,树multiset还将添加保持集合排序的好处.