为什么复制对字符串的引用比复制int要慢得多(但反之亦然,对于Array.Copy())?

Ale*_*nov 4 .net arrays performance

假设我想将数组的一部分向右移动1.我可以使用Array.Copy或只是循环复制元素:

private static void BuiltInCopy<T>(T[] arg, int start) {
    int length = arg.Length - start - 1;
    Array.Copy(arg, start, arg, start + 1, length);
}

private static void ElementByElement<T>(T[] arg, int start) {
    for (int i = arg.Length - 1; i > start; i--) {
        arg[i] = arg[i - 1];
    }
}

private static void ElementByElement2<T>(T[] arg, int start) {
    int i = arg.Length - 1;
    while (i > start)
        arg[i] = arg[--i];
}
Run Code Online (Sandbox Code Playgroud)

(ElementByElement2Matt Howells建议.)

我用Minibench测试了它,结果让我很惊讶.

internal class Program {
    private static int smallArraySize = 32;

    public static void Main(string[] args) {
        BenchArrayCopy();
    }

    private static void BenchArrayCopy() {
        var smallArrayInt = new int[smallArraySize];
        for (int i = 0; i < smallArraySize; i++)
            smallArrayInt[i] = i;

        var smallArrayString = new string[smallArraySize];
        for (int i = 0; i < smallArraySize; i++)
            smallArrayString[i] = i.ToString();

        var smallArrayDateTime = new DateTime[smallArraySize];
        for (int i = 0; i < smallArraySize; i++)
            smallArrayDateTime[i] = DateTime.Now;

        var moveInt = new TestSuite<int[], int>("Move part of array right by 1: int")
            .Plus(BuiltInCopy, "Array.Copy()")
            .Plus(ElementByElement, "Element by element (for)")
            .Plus(ElementByElement2, "Element by element (while)")
            .RunTests(smallArrayInt, 0);

        var moveString = new TestSuite<string[], string>("Move part of array right by 1: string")
            .Plus(BuiltInCopy, "Array.Copy()")
            .Plus(ElementByElement, "Element by element (for)")
            .Plus(ElementByElement2, "Element by element (while)")
            .RunTests(smallArrayString, "0");

        moveInt.Display(ResultColumns.All, moveInt.FindBest());
        moveString.Display(ResultColumns.All, moveInt.FindBest());
    }

    private static T ElementByElement<T>(T[] arg) {
        ElementByElement(arg, 1);
        return arg[0];
    }

    private static T ElementByElement2<T>(T[] arg) {
        ElementByElement2(arg, 1);
        return arg[0];
    }

    private static T BuiltInCopy<T>(T[] arg) {
        BuiltInCopy(arg, 1);
        return arg[0];
    }

    private static void BuiltInCopy<T>(T[] arg, int start) {
        int length = arg.Length - start - 1;
        Array.Copy(arg, start, arg, start + 1, length);
    }

    private static void ElementByElement<T>(T[] arg, int start) {
        for (int i = arg.Length - 1; i > start; i--) {
            arg[i] = arg[i - 1];
        }
    }

    private static void ElementByElement2<T>(T[] arg, int start) {
        int i = arg.Length - 1;
        while (i > start)
            arg[i] = arg[--i];
    }
}
Run Code Online (Sandbox Code Playgroud)

请注意,此处未测量分配.所有方法都只是复制数组元素.由于我在32位操作系统上,因此一个int和一个string引用在堆栈上占用相同的空间.

这是我期望看到的:

  1. BuiltInCopy应该是最快的,原因有两个:1)它可以做内存复制; 2)List<T>.Insert用途Array.Copy.另一方面,它是非泛型的,当数组有不同的类型时它可以做很多额外的工作,所以也许它没有充分利用1).
  2. ElementByElement应该同样快intstring.
  3. BuiltInCopy要么是同样快intstring,或更慢int(如果它做一些拳击).

但是,所有这些假设都是错误的(至少在我的.NET 3.5 SP1机器上)!

  1. BuiltInCopy<int>ElementByElement<int>32元素阵列慢得多.当尺寸增加时,BuiltInCopy<int>变得更快.
  2. ElementByElement<string>4倍以上ElementByElement<int>.
  3. BuiltInCopy<int>比...更快BuiltInCopy<string>.

任何人都可以解释这些结果吗?

更新:从CLR代码生成团队博客文章中关于数组边界检查消除:

建议4:当您复制中型到大型数组时,请使用Array.Copy,而不是显式复制循环.首先,所有范围检查将被"提升"到循环外的单个检查. 如果数组包含对象引用,您还将有效地"提升"与存储到对象类型数组相关的两个额外费用:与数组协方差相关的每个元素"存储检查"通常可以通过检查动态来消除数组的类型和与垃圾收集相关的写屏障将被聚合并变得更加有效. 最后,我们将能够使用更高效的"memcpy"式复制循环.(在即将到来的多核世界中,如果阵列足够大,甚至可能采用并行性!)

最后一列是得分(滴答中的总持续时间/迭代次数,由最佳结果标准化).

两次运行smallArraySize = 32:

f:\MyProgramming\TimSort\Benchmarks\bin\Release>Benchmarks.exe
============ Move part of array right by 1: int ============
Array.Copy()               468791028 0:30.350 1,46
Element by element (for)   637091585 0:29.895 1,06
Element by element (while) 667595468 0:29.549 1,00

============ Move part of array right by 1: string ============
Array.Copy()               432459039 0:30.929 1,62
Element by element (for)   165344842 0:30.407 4,15
Element by element (while) 150996286 0:28.399 4,25


f:\MyProgramming\TimSort\Benchmarks\bin\Release>Benchmarks.exe
============ Move part of array right by 1: int ============
Array.Copy()               459040445 0:29.262 1,38
Element by element (for)   645863535 0:30.929 1,04
Element by element (while) 651068500 0:30.064 1,00

============ Move part of array right by 1: string ============
Array.Copy()               403684808 0:30.191 1,62
Element by element (for)   162646202 0:30.051 4,00
Element by element (while) 160947492 0:30.945 4,16
Run Code Online (Sandbox Code Playgroud)

两次运行smallArraySize = 256:

f:\MyProgramming\TimSort\Benchmarks\bin\Release>Benchmarks.exe
============ Move part of array right by 1: int ============
Array.Copy()               172632756 0:30.128 1,00
Element by element (for)    91403951 0:30.253 1,90
Element by element (while)  65352624 0:29.141 2,56

============ Move part of array right by 1: string ============
Array.Copy()               153426720 0:28.964 1,08
Element by element (for)    19518483 0:30.353 8,91
Element by element (while)  19399180 0:29.793 8,80


f:\MyProgramming\TimSort\Benchmarks\bin\Release>Benchmarks.exe
============ Move part of array right by 1: int ============
Array.Copy()               184710866 0:30.456 1,00
Element by element (for)    92878947 0:29.959 1,96
Element by element (while)  73588500 0:30.331 2,50

============ Move part of array right by 1: string ============
Array.Copy()               157998697 0:30.336 1,16
Element by element (for)    19905046 0:29.995 9,14
Element by element (while)  18838572 0:29.382 9,46
Run Code Online (Sandbox Code Playgroud)

Jon*_*eet 5

需要注意的几点:

  • 因为BuiltInCopy,每次迭代你还有一个方法调用 - 你的第一个方法调用另一个然后调用的重载Array.Copy.这是一点开销.
  • 您的实现不会确切地检查重叠副本必须执行的操作.根据它们是"向上"还是"向下"移动(当目标数组与源相同时),它们应该从开始或结束起作用,以避免损坏.Array.Copy将得到正确的 - 这是开销.
  • Array.Copy采用一般Array参考,可以是多维的,不同的类型等.您的方法只能在单个数组上工作.
  • Array.Copy 做了一大堆检查排名,类型兼容性等.你的方法没有.
  • 您的方法占用的参数较少,这意味着需要在方法调用上复制较少的数据.

我不知道如何解释引用类型和值类型之间的区别,但上面应该解释为什么它不是内置副本和例程之间的公平比较.