What is the bottleneck in the Linq.Max implementation?

Tho*_*oub 7 c# linq performance

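Preamble: I was converting some code (a manual max search over an array) into a super-sexy one-liner using Linq.Max(), which made me wonder about performance (I often work with large arrays). So I wrote a small program to test it, because I only trust what I can see, and got this result: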

The size is now of 1 elements
With the loop it took:  00:00:00.0000015 
With Linq it took:      00:00:00.0000288 
The loop is faster: 94,79%
-----------------------------------------
The size is now of 10 elements
With the loop it took:  00:00:00 
With Linq it took:      00:00:00.0000007 
The loop is faster: 100,00%
-----------------------------------------
The size is now of 100 elements
With the loop it took:  00:00:00 
With Linq it took:      00:00:00.0000011 
The loop is faster: 100,00%
-----------------------------------------
The size is now of 1 000 elements
With the loop it took:  00:00:00.0000003 
With Linq it took:      00:00:00.0000078 
The loop is faster: 96,15%
-----------------------------------------
The size is now of 10 000 elements
With the loop it took:  00:00:00.0000063 
With Linq it took:      00:00:00.0000765 
The loop is faster: 91,76%
-----------------------------------------
The size is now of 100 000 elements
With the loop it took:  00:00:00.0000714 
With Linq it took:      00:00:00.0007602 
The loop is faster: 90,61%
-----------------------------------------
The size is now of 1 000 000 elements
With the loop it took:  00:00:00.0007669 
With Linq it took:      00:00:00.0081737 
The loop is faster: 90,62%
-----------------------------------------
The size is now of 10 000 000 elements
With the loop it took:  00:00:00.0070811 
With Linq it took:      00:00:00.0754348 
The loop is faster: 90,61%
-----------------------------------------
The size is now of 100 000 000 elements
With the loop it took:  00:00:00.0788133 
With Linq it took:      00:00:00.7758791 
The loop is faster: 89,84%

In short, Linq is almost 10 times slower, which bothered me, so I looked at the implementation of Max():

public static int Max(this IEnumerable<int> source) {
    if (source == null) throw Error.ArgumentNull("source");
    int value = 0;
    bool hasValue = false;
    foreach (int x in source) {
        if (hasValue) {
            if (x > value) value = x;
        }
        else {
            value = x;
            hasValue = true;
        }
    }
    if (hasValue) return value;
    throw Error.NoElements();
}

As the title already asks: what in this implementation makes it 10 times slower? (It is not about ForEach, I have already checked that.)

Edit:

Of course, I tested in Release mode.

Here is my test code (without the output):

//----------------
private int[] arrDoubles;
//----------------

Stopwatch watch = new Stopwatch();
// Stop at 100 million elements to avoid running out of memory on my laptop
for (int i = 1; i <= 100000000; i = i * 10)
{
    fillArray(i);
    watch.Restart();
    int max = Int32.MinValue; // Reset
    for (int j = 0; j < arrDoubles.Length; j++)
    {
        max = Math.Max(arrDoubles[j], max);
    }
    watch.Stop();

    TimeSpan loopSpan = watch.Elapsed;

    watch.Restart();
    max = Int32.MinValue; // Reset
    max = arrDoubles.Max();
    watch.Stop();

    TimeSpan linqSpan = watch.Elapsed;
}

//-------------------------------------------

private void fillArray(int nbValues)
{
    int Min = Int32.MinValue;
    int Max = Int32.MaxValue;
    Random randNum = new Random();
    arrDoubles = Enumerable.Repeat(0, nbValues).Select(i => randNum.Next(Min, Max)).ToArray();
}

Mat*_*son 10

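This probably happens because accessing an array through IEnumerable<> is much slower than accessing it through its actual array type (even when using foreach).

The following code demonstrates this. Note that the bodies of max1() and max2() are identical; the only difference is the type of the array parameter. During the test, both methods are passed the same object.

Try running it from a RELEASE build (and not under the debugger, which turns on debug code even for a release build):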

using System;
using System.Collections.Generic;
using System.Diagnostics;

namespace Demo
{
    public class Program
    {
        private static void Main(string[] args)
        {
            var array = new int[100000000];

            var sw = new Stopwatch();

            for (int trial = 0; trial < 8; ++trial)
            {
                sw.Restart();
                for (int i = 0; i < 10; ++i)
                    max1(array);
                var elapsed1 = sw.Elapsed;
                Console.WriteLine("int[] took " + elapsed1);

                sw.Restart();
                for (int i = 0; i < 10; ++i)
                    max2(array);
                var elapsed2 = sw.Elapsed;
                Console.WriteLine("IEnumerable<int> took " + elapsed2);

                Console.WriteLine("\nFirst method was {0} times faster.\n", elapsed2.TotalSeconds / elapsed1.TotalSeconds);
            }
        }

        private static int max1(int[] array)
        {
            int result = int.MinValue;

            foreach (int n in array)
                if (n > result)
                    result = n;

            return result;
        }

        private static int max2(IEnumerable<int> array)
        {
            int result = int.MinValue;

            foreach (int n in array)
                if (n > result)
                    result = n;

            return result;
        }
    }
}

On my PC, the int[] version is about 10 times faster than the IEnumerable<int> version.
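To make the difference between max1() and max2() more concrete, here is a rough sketch of what the two foreach loops approximately expand to. The class and method names (ForeachExpansion, MaxOverArray, MaxOverEnumerable) are made up for illustration, and the exact code the compiler and JIT produce can differ between versions, so treat this as an approximation rather than the actual generated code. The point is that a foreach over a variable typed as int[] becomes a plain indexed loop, while a foreach over IEnumerable<int> goes through GetEnumerator(), MoveNext() and Current, which are interface calls made once or twice per element.

```csharp
using System;
using System.Collections.Generic;

public static class ForeachExpansion
{
    // Approximately what `foreach (int n in array)` turns into
    // when the static type of `array` is int[]: a plain indexed loop.
    public static int MaxOverArray(int[] array)
    {
        int result = int.MinValue;
        for (int i = 0; i < array.Length; i++)
        {
            int n = array[i];
            if (n > result)
                result = n;
        }
        return result;
    }

    // Approximately what the same foreach turns into when the static type
    // is IEnumerable<int>: an enumerator object plus two interface calls
    // (MoveNext and Current) for every element.
    public static int MaxOverEnumerable(IEnumerable<int> source)
    {
        int result = int.MinValue;
        using (IEnumerator<int> e = source.GetEnumerator())
        {
            while (e.MoveNext())
            {
                int n = e.Current;
                if (n > result)
                    result = n;
            }
        }
        return result;
    }

    public static void Main()
    {
        int[] data = { 3, -7, 42, 19 };

        // Same object, same result; only the access path differs.
        Console.WriteLine(MaxOverArray(data));
        Console.WriteLine(MaxOverEnumerable(data));
    }
}
```

Both methods return the same value for the same array. If the cost matters for large arrays, keeping the parameter typed as int[] (as in max1()) avoids the enumerator path.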