缓存IEnumerable

djs*_*ner 19 c# ienumerable yield return

public IEnumerable<ModuleData> ListModules()
{
    foreach (XElement m in Source.Descendants("Module"))
    {
        yield return new ModuleData(m.Element("ModuleID").Value);
    }
}
Run Code Online (Sandbox Code Playgroud)

最初上面的代码很棒,因为如果不需要,就不需要评估整个集合.

但是,一旦枚举了所有模块,在没有更改时重复查询XDocument会变得更加昂贵.

因此,作为绩效改进:

public IEnumerable<ModuleData> ListModules()
{
    if (Modules == null)
    {
        Modules = new List<ModuleData>();
        foreach (XElement m in Source.Descendants("Module"))
        {
            Modules.Add(new ModuleData(m.Element("ModuleID").Value, 1, 1));
        }
    }
    return Modules;
}
Run Code Online (Sandbox Code Playgroud)

如果我反复使用整个列表,那就太好了,但不是那么好.

是否存在中间点,我可以在整个列表被迭代之前返回,然后缓存它并将缓存提供给后续请求?

Dzm*_*uba 10

您可以查看保存枚举器状态,其中描述了如何创建惰性列表(缓存项目时缓存一次).

  • 对于后代,您能否在答案中包含您认为有用的链接的相关部分?这样,如果链接断开、更改等,您的答案将不会变得毫无用处。非常感谢。 (2认同)

Gra*_*meF 6

看看MemoizeAll()无扩展的.NET库(Rx)的。由于它的评估是懒惰的,因此您可以在施工期间安全地进行设置,并且只需ModulesListModules()以下位置返回即可:

Modules = Source.
    Descendants("Module").
    Select(m => new ModuleData(m.Element("ModuleID").Value, 1, 1)).
    MemoizeAll();
Run Code Online (Sandbox Code Playgroud)

有一个很好的解释MemoizeAll()(其他一些不太明显的Rx扩展和)在这里


tse*_*mer 5

我已经看到了一些实现,一些较旧的并且没有利用最新的 .Net 类,有些过于复杂,无法满足我的需求。我最终得到了我能收集到的最简洁和声明性的代码,它加起来就是一个包含大约 15 行(实际)代码的类。它似乎很符合 OP 的需求:

编辑:第二次修订,更好地支持空枚举

/// <summary>
/// A <see cref="IEnumerable{T}"/> that caches every item upon first enumeration.
/// </summary>
/// <seealso cref="http://blogs.msdn.com/b/matt/archive/2008/03/14/digging-deeper-into-lazy-and-functional-c.aspx"/>
/// <seealso cref="http://blogs.msdn.com/b/wesdyer/archive/2007/02/13/the-virtues-of-laziness.aspx"/>
public class CachedEnumerable<T> : IEnumerable<T> {
  private readonly bool _hasItem; // Needed so an empty enumerable will not return null but an actual empty enumerable.
  private readonly T _item;
  private readonly Lazy<CachedEnumerable<T>> _nextItems;

  /// <summary>
  /// Initialises a new instance of <see cref="CachedEnumerable{T}"/> using <paramref name="item"/> as the current item
  /// and <paramref name="nextItems"/> as a value factory for the <see cref="CachedEnumerable{T}"/> containing the next items.
  /// </summary>
  protected internal CachedEnumerable(T item, Func<CachedEnumerable<T>> nextItems) {
    _hasItem = true;
    _item = item;
    _nextItems = new Lazy<CachedEnumerable<T>>(nextItems);
  }

  /// <summary>
  /// Initialises a new instance of <see cref="CachedEnumerable{T}"/> with no current item and no next items.
  /// </summary>
  protected internal CachedEnumerable() {
    _hasItem = false;
  }

  /// <summary>
  /// Instantiates and returns a <see cref="CachedEnumerable{T}"/> for a given <paramref name="enumerable"/>.
  /// Notice: The first item is always iterated through.
  /// </summary>
  public static CachedEnumerable<T> Create(IEnumerable<T> enumerable) {
    return Create(enumerable.GetEnumerator());
  }

  /// <summary>
  /// Instantiates and returns a <see cref="CachedEnumerable{T}"/> for a given <paramref name="enumerator"/>.
  /// Notice: The first item is always iterated through.
  /// </summary>
  private static CachedEnumerable<T> Create(IEnumerator<T> enumerator) {
    return enumerator.MoveNext() ? new CachedEnumerable<T>(enumerator.Current, () => Create(enumerator)) : new CachedEnumerable<T>();
  }

  /// <summary>
  /// Returns an enumerator that iterates through the collection.
  /// </summary>
  public IEnumerator<T> GetEnumerator() {
    if (_hasItem) {
      yield return _item;

      var nextItems = _nextItems.Value;
      if (nextItems != null) {
        foreach (var nextItem in nextItems) {
          yield return nextItem;
        }
      }
    }
  }

  /// <summary>
  /// Returns an enumerator that iterates through a collection.
  /// </summary>
  IEnumerator IEnumerable.GetEnumerator() {
    return GetEnumerator();
  }
}
Run Code Online (Sandbox Code Playgroud)

一个有用的扩展方法可能是:

public static class IEnumerableExtensions {
  /// <summary>
  /// Instantiates and returns a <see cref="CachedEnumerable{T}"/> for a given <paramref name="enumerable"/>.
  /// Notice: The first item is always iterated through.
  /// </summary>
  public static CachedEnumerable<T> ToCachedEnumerable<T>(this IEnumerable<T> enumerable) {
    return CachedEnumerable<T>.Create(enumerable);
  }
}
Run Code Online (Sandbox Code Playgroud)

对于你们中间的单元测试人员:(如果你不使用 resharper,只需取出[SuppressMessage]属性)

/// <summary>
/// Tests the <see cref="CachedEnumerable{T}"/> class.
/// </summary>
[TestFixture]
public class CachedEnumerableTest {
  private int _count;

  /// <remarks>
  /// This test case is only here to emphasise the problem with <see cref="IEnumerable{T}"/> which <see cref="CachedEnumerable{T}"/> attempts to solve.
  /// </remarks>
  [Test]
  [SuppressMessage("ReSharper", "PossibleMultipleEnumeration")]
  [SuppressMessage("ReSharper", "ReturnValueOfPureMethodIsNotUsed")]
  public void MultipleEnumerationAreNotCachedForOriginalIEnumerable() {
    _count = 0;

    var enumerable = Enumerable.Range(1, 40).Select(IncrementCount);

    enumerable.Take(3).ToArray();
    enumerable.Take(10).ToArray();
    enumerable.Take(4).ToArray();

    Assert.AreEqual(17, _count);
  }

  /// <remarks>
  /// This test case is only here to emphasise the problem with <see cref="IList{T}"/> which <see cref="CachedEnumerable{T}"/> attempts to solve.
  /// </remarks>
  [Test]
  [SuppressMessage("ReSharper", "PossibleMultipleEnumeration")]
  [SuppressMessage("ReSharper", "ReturnValueOfPureMethodIsNotUsed")]
  public void EntireListIsEnumeratedForOriginalListOrArray() {
    _count = 0;
    Enumerable.Range(1, 40).Select(IncrementCount).ToList();
    Assert.AreEqual(40, _count);

    _count = 0;
    Enumerable.Range(1, 40).Select(IncrementCount).ToArray();
    Assert.AreEqual(40, _count);
  }

  [Test]
  [SuppressMessage("ReSharper", "ReturnValueOfPureMethodIsNotUsed")]
  public void MultipleEnumerationsAreCached() {
    _count = 0;

    var cachedEnumerable = Enumerable.Range(1, 40).Select(IncrementCount).ToCachedEnumerable();

    cachedEnumerable.Take(3).ToArray();
    cachedEnumerable.Take(10).ToArray();
    cachedEnumerable.Take(4).ToArray();

    Assert.AreEqual(10, _count);
  }

  [Test]
  public void FreshCachedEnumerableDoesNotEnumerateExceptFirstItem() {
    _count = 0;

    Enumerable.Range(1, 40).Select(IncrementCount).ToCachedEnumerable();

    Assert.AreEqual(1, _count);
  }

  /// <remarks>
  /// Based on Jon Skeet's test mentioned here: http://www.siepman.nl/blog/post/2013/10/09/LazyList-A-better-LINQ-result-cache-than-List.aspx
  /// </remarks>
  [Test]
  [SuppressMessage("ReSharper", "LoopCanBeConvertedToQuery")]
  public void MatrixEnumerationIteratesAsExpectedWhileStillKeepingEnumeratedValuesCached() {
    _count = 0;

    var cachedEnumerable = Enumerable.Range(1, 5).Select(IncrementCount).ToCachedEnumerable();

    var matrixCount = 0;

    foreach (var x in cachedEnumerable) {
      foreach (var y in cachedEnumerable) {
        matrixCount++;
      }
    }

    Assert.AreEqual(5, _count);
    Assert.AreEqual(25, matrixCount);
  }

  [Test]
  public void OrderingCachedEnumerableWorksAsExpectedWhileStillKeepingEnumeratedValuesCached() {
    _count = 0;

    var cachedEnumerable = Enumerable.Range(1, 5).Select(IncrementCount).ToCachedEnumerable();

    var orderedEnumerated = cachedEnumerable.OrderBy(x => x);
    var orderedEnumeratedArray = orderedEnumerated.ToArray(); // Enumerated first time in ascending order.
    Assert.AreEqual(5, _count);

    for (int i = 0; i < orderedEnumeratedArray.Length; i++) {
      Assert.AreEqual(i + 1, orderedEnumeratedArray[i]);
    }

    var reorderedEnumeratedArray = orderedEnumerated.OrderByDescending(x => x).ToArray(); // Enumerated second time in descending order.
    Assert.AreEqual(5, _count);

    for (int i = 0; i < reorderedEnumeratedArray.Length; i++) {
      Assert.AreEqual(5 - i, reorderedEnumeratedArray[i]);
    }
  }

  private int IncrementCount(int value) {
    _count++;
    return value;
  }
}
Run Code Online (Sandbox Code Playgroud)


haz*_*zik 5

我喜欢@tsemer的回答。但是我想提出我的解决方案,它与FP无关。这是一种幼稚的方法,但是它产生的分配要少得多。而且它不是线程安全的。

public class CachedEnumerable<T> : IEnumerable<T>, IDisposable
{
    IEnumerator<T> _enumerator;
    readonly List<T> _cache = new List<T>();

    public CachedEnumerable(IEnumerable<T> enumerable) 
        : this(enumerable.GetEnumerator())
    {
    }

    public CachedEnumerable(IEnumerator<T> enumerator)
    {
        _enumerator = enumerator;
    }

    public IEnumerator<T> GetEnumerator()
    {
        // The index of the current item in the cache.
        int index = 0;

        // Enumerate the _cache first
        for (; index < _cache.Count; index++)
        {
            yield return _cache[index];
        }

        // Continue enumeration of the original _enumerator, 
        // until it is finished. 
        // This adds items to the cache and increment 
        for (; _enumerator != null && _enumerator.MoveNext(); index++)
        {
            var current = _enumerator.Current;
            _cache.Add(current);
            yield return current;
        }

        if (_enumerator != null)
        {
            _enumerator.Dispose();
            _enumerator = null;
        }

        // Some other users of the same instance of CachedEnumerable
        // can add more items to the cache, 
        // so we need to enumerate them as well
        for (; index < _cache.Count; index++)
        {
            yield return _cache[index];
        }
    }

    public void Dispose()
    {
        if (_enumerator != null)
        {
            _enumerator.Dispose();
            _enumerator = null;
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
Run Code Online (Sandbox Code Playgroud)

这就是@tsemer的答案中的矩阵测试的工作方式:

var ints = new [] { 1, 2, 3, 4, 5 };
var cachedEnumerable = new CachedEnumerable<int>(ints); 
foreach (var x in cachedEnumerable)
{
    foreach (var y in cachedEnumerable)
    {
        //Do something
    }
}
Run Code Online (Sandbox Code Playgroud)
  1. 外循环(x)首先跳过for,因为_cache为空;
  2. x_enumerator到取一项_cache;
  3. x在第二个for循环之前暂停;
  4. 内部循环(y)从中枚举一个元素_cache;
  5. y获取从_enumerator到的所有元素_cache
  6. y跳过第三个for循环,因为它的index变量等于5;
  7. x恢复,index等于1for因为_enumerator完成,它跳过了第二个循环。
  8. x_cache使用第三for循环中枚举一个元素;
  9. x在第三点之前停顿for;
  10. y_cache使用第一for循环中枚举5个元素;
  11. y跳过第二个for循环,因为_enumerator已完成;
  12. y跳过第三for循环,因为,indexy平等5;
  13. x恢复,递增index。它_cache使用第三个for循环从中获取一个元素。
  14. x 停顿一下。
  15. 如果的index变量x小于5则转到10;
  16. 结束。


Mr *_*r D 5

我非常喜欢哈齐克的回答……美好而简单始终是正确的方式。但是 GetEnumerator 有一个错误

它意识到存在问题,这就是为什么在第二个枚举器循环之后有一个奇怪的第三个循环......但它并不那么简单。触发第三个循环的问题是普遍的......所以它需要是递归的。

但答案看起来更简单。

    public IEnumerator<T> GetEnumerator()
    {
        int index = 0;

        while (true)
        {
            if (index < _cache.Count)
            {
                yield return _cache[index];
                index = index + 1;
            }
            else
            {
                if (_enumerator.MoveNext())
                {
                    _cache.Add(_enumerator.Current);
                }
                else
                {
                    yield break;
                }
            }
        }
    }
Run Code Online (Sandbox Code Playgroud)

是的,你可以通过产生电流来提高它的效率……但我会采取微秒级的打击……每个元素只发生一次。

而且它不是线程安全的......但谁在乎呢。