OrderBy and Top in LINQ with good performance

DRB*_*ise 12 c# linq performance linq-to-objects sql-order-by

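What is a good way to get the top 10 records from a large collection using a custom OrderBy? If I use the LINQ to Objects OrderBy method, it is slow and uses a lot of memory, because it creates an entire new collection in the new order. I would like a new method with the signature below that does not reorder the whole collection and is very fast: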

public static IEnumerable<TSource> OrderByTop<TSource, TKey>(
    IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector,
    IComparer<TKey> comparer,
    int topCount)

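I tried to write it, but it became very complicated, and I thought there might be a simpler way using Aggregate or something similar. Any help would be appreciated.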

Answers

Thanks for the help. I ended up with the following code:

public static List<TSource> OrderByTop<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector,
    IComparer<TKey> comparer,
    int topCount)
{
    var itemComparer = keySelector.ToIComparer(comparer);
    return source.Aggregate(
        new List<TSource>(topCount),
        (List<TSource> list, TSource item) =>
            list.SortedInsert(item, itemComparer, topCount));
}

The List extension method SortedInsert is as follows:

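public static List<T> SortedInsert<T>(
    this List<T> list,
    T item,
    IComparer<T> comparer,
    int maxLength)
{
    // Nothing can be kept when the list may hold no items
    // (this also avoids indexing list[-1] below).
    if (maxLength <= 0)
        return list;
    if (list.Count == maxLength)
    {
        // The list is full: skip items that do not sort before the last one.
        if (comparer.Compare(item, list[maxLength - 1]) >= 0)
            return list;
        list.RemoveAt(maxLength - 1);
    }
    // BinarySearch returns the bitwise complement of the insertion index
    // when the item is not found.
    int insertIndex = list.BinarySearch(item, comparer);
    if (insertIndex < 0)
        insertIndex = ~insertIndex;
    list.Insert(insertIndex, item);
    return list;
}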

For those interested, I also have an extension method that converts the keySelector to an IComparer.

public static IComparer<TSource> ToIComparer<TSource, TKey>(
    this Func<TSource, TKey> keySelector,
    IComparer<TKey> comparer)
{
    return new KeySelectorToIComparerConverter<TSource, TKey>(
        keySelector,
        comparer);
}
private class KeySelectorToIComparerConverter<TSource, TKey>
    : IComparer<TSource>
{
    private readonly IComparer<TKey> comparer;
    private readonly Func<TSource, TKey> keySelector;
    public KeySelectorToIComparerConverter(
        Func<TSource, TKey> keySelector,
        IComparer<TKey> comparer)
    {
        this.comparer = comparer;
        this.keySelector = keySelector;
    }
    public int Compare(TSource x, TSource y)
    {
        return comparer.Compare(keySelector(x), keySelector(y));
    }
}
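As a usage sketch, the approach in this answer can be condensed into one self-contained file and checked against OrderBy followed by Take. This is not the code above verbatim: a plain foreach loop stands in for Aggregate, Comparer<T>.Create (assumed available, .NET Framework 4.5 or later) stands in for the ToIComparer helper, and the class names TopNSketch and Program and the sample data are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TopNSketch
{
    // Same idea as OrderByTop above: keep a sorted buffer of at most topCount items.
    public static List<TSource> OrderByTop<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        IComparer<TKey> comparer,
        int topCount)
    {
        var buffer = new List<TSource>(Math.Max(topCount, 0));
        if (topCount <= 0)
            return buffer;

        // Assumption: Comparer<T>.Create replaces the ToIComparer helper.
        IComparer<TSource> itemComparer = Comparer<TSource>.Create(
            (x, y) => comparer.Compare(keySelector(x), keySelector(y)));

        foreach (TSource item in source)
        {
            if (buffer.Count == topCount)
            {
                // Full buffer: ignore items that do not sort before the last one.
                if (itemComparer.Compare(item, buffer[topCount - 1]) >= 0)
                    continue;
                buffer.RemoveAt(topCount - 1);
            }
            // A negative result is the bitwise complement of the insertion index.
            int index = buffer.BinarySearch(item, itemComparer);
            if (index < 0)
                index = ~index;
            buffer.Insert(index, item);
        }
        return buffer;
    }
}

public static class Program
{
    public static void Main()
    {
        int[] numbers = { 42, 7, 19, 3, 88, 7, 1, 56 };

        // The three smallest values, without sorting the whole array.
        List<int> smallest = numbers.OrderByTop(n => n, Comparer<int>.Default, 3);
        Console.WriteLine(string.Join(", ", smallest));

        // Cross-check against the full sort the question wants to avoid.
        List<int> viaOrderBy = numbers.OrderBy(n => n).Take(3).ToList();
        Console.WriteLine(smallest.SequenceEqual(viaOrderBy));
    }
}
```

Each item costs one binary search plus an insert into a buffer of at most topCount items, so memory stays bounded by topCount rather than by the size of the source.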

Mar*_*ner 8

Aggregate is a good place to start:

SortedList<TKey, TSource> resultlist = new SortedList<TKey, TSource>();
MyBigList.Aggregate(resultlist, (aktlist,entry) => {
   aktlist.Add(entry.Key, entry);
   if (aktlist.Count > 10) aktlist.RemoveAt(10);
   return aktlist;
});

If you want a different comparer, you can pass one to the SortedList constructor.
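A minimal sketch of passing a comparer to the SortedList constructor, here a descending one so that the largest values are kept instead of the smallest. The names SortedListTopSketch and TopLargest are hypothetical, Comparer<T>.Create is assumed to be available (.NET Framework 4.5 or later), and the input values are assumed to be unique.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortedListTopSketch
{
    // Keeps the `count` largest values by sorting keys in descending order.
    // Assumptions: count >= 0 and all values are unique
    // (SortedList.Add throws on a duplicate key).
    public static IList<int> TopLargest(IEnumerable<int> source, int count)
    {
        // Reverse the default ordering so the largest key sits at index 0.
        IComparer<int> descending = Comparer<int>.Create((a, b) => b.CompareTo(a));
        var result = new SortedList<int, int>(descending);

        return source.Aggregate(result, (list, entry) =>
        {
            list.Add(entry, entry);
            // Drop the entry that fell off the end of the top `count`.
            if (list.Count > count) list.RemoveAt(count);
            return list;
        }).Values;
    }
}
```

For example, SortedListTopSketch.TopLargest(new[] { 5, 1, 9, 3, 7 }, 2) yields 9 and 7.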

Edit: As nikie mentioned, SortedList cannot contain duplicate keys. You can use a standard List with BinarySearch to achieve the same effect:

List<TSource> resultlist = new List<TSource>();
MyBigList.Aggregate(resultlist, (aktlist, entry) => {
   int index = aktlist.BinarySearch(entry);
   if (index < 0) index = ~index;
   if (index < 10) aktlist.Insert(index, entry);
   if (aktlist.Count > 10) aktlist.RemoveAt(10);
   return aktlist;
});

Again, a custom comparer (and with it custom key selection) can be passed as an argument to BinarySearch.
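One way this could look, as a sketch: the key selector is wrapped in an item comparer, which is then passed to the List<T>.BinarySearch(T, IComparer<T>) overload. The names BinarySearchTopSketch and Top are hypothetical, and Comparer<T>.Create is assumed to be available (.NET Framework 4.5 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BinarySearchTopSketch
{
    // Returns the `count` items with the smallest keys; duplicate keys are allowed.
    // Assumption: count >= 0.
    public static List<TSource> Top<TSource, TKey>(
        IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        IComparer<TKey> keyComparer,
        int count)
    {
        // Compare items by their selected keys.
        IComparer<TSource> byKey = Comparer<TSource>.Create(
            (x, y) => keyComparer.Compare(keySelector(x), keySelector(y)));

        return source.Aggregate(new List<TSource>(), (list, entry) =>
        {
            int index = list.BinarySearch(entry, byKey);
            // Not found: the complement of the result is the insertion point.
            if (index < 0) index = ~index;
            if (index < count) list.Insert(index, entry);
            if (list.Count > count) list.RemoveAt(count);
            return list;
        });
    }
}
```

A call such as BinarySearchTopSketch.Top(words, s => s.Length, Comparer<int>.Default, 3) would return the three shortest strings in words. Strings of equal length are kept, but their relative order is not guaranteed.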

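  • IIRC, SortedList throws an exception when the key already exists. (2 upvotes)
  • Very nice! It should be RemoveAt(10), but as nikie said, it does not accept duplicate keys. (2 upvotes)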
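
The duplicate-key behaviour raised in these comments can be checked with a short sketch. The class and method names are hypothetical; SortedList<TKey, TValue>.Add throws an ArgumentException when the key is already present.

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateKeyDemo
{
    // Returns true when adding the same key twice raises ArgumentException.
    public static bool SecondAddThrows()
    {
        var list = new SortedList<int, string>();
        list.Add(1, "first");
        try
        {
            list.Add(1, "second");   // same key again
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```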