为什么.NET中的多维数组比普通数组慢?

Hos*_*Aly 49 .net arrays performance

编辑:我向大家道歉.我实际上想要说"多维数组"时使用了"锯齿状数组"一词(如下面的例子所示).我为使用错误的名字道歉.我实际上发现锯齿状阵列比多维阵列更快!我已经为锯齿状阵列添加了测量值.

我试图用一个 盘陀今天的多维数组,当我注意到它的性能并不像我预期的那样.使用单维数组和手动计算索引要比使用2D数组快得多(几乎两倍).我使用1024*1024数组(初始化为随机值)编写了一个测试 ,进行了1000次迭代,我在我的机器上得到了以下结果:

sum(double[], int): 2738 ms (100%)
sum(double[,]):     5019 ms (183%)
sum(double[][]):    2540 ms ( 93%)
Run Code Online (Sandbox Code Playgroud)

这是我的测试代码:

public static double sum(double[] d, int l1) {
    // assuming the array is rectangular
    double sum = 0;
    int l2 = d.Length / l1;
    for (int i = 0; i < l1; ++i)
        for (int j = 0; j < l2; ++j)
            sum += d[i * l2 + j];
    return sum;
}

public static double sum(double[,] d) {
    double sum = 0;
    int l1 = d.GetLength(0);
    int l2 = d.GetLength(1);
    for (int i = 0; i < l1; ++i)
        for (int j = 0; j < l2; ++j)
            sum += d[i, j];
    return sum;
}

public static double sum(double[][] d) {
    double sum = 0;
    for (int i = 0; i < d.Length; ++i)
        for (int j = 0; j < d[i].Length; ++j)
            sum += d[i][j];
    return sum;
}

public static void Main() {
    Random random = new Random();
    const int l1  = 1024, l2 = 1024;
    double[ ] d1  = new double[l1 * l2];
    double[,] d2  = new double[l1 , l2];
    double[][] d3 = new double[l1][];

    for (int i = 0; i < l1; ++i) {
        d3[i] = new double[l2];
        for (int j = 0; j < l2; ++j)
            d3[i][j] = d2[i, j] = d1[i * l2 + j] = random.NextDouble();
    }
    //
    const int iterations = 1000;
    TestTime(sum, d1, l1, iterations);
    TestTime(sum, d2, iterations);
    TestTime(sum, d3, iterations);
}
Run Code Online (Sandbox Code Playgroud)

进一步研究表明,第二种方法的IL比第一种方法大23%.(代码大小68比52)这主要是由于呼叫System.Array::GetLength(int).编译器还发出呼吁Array::Get盘陀多维数组,而它只需要ldelem简单的数组.

所以我想知道,为什么通过多维数组访问比普通数组更慢?我会假设编译器(或JIT)会做类似于我在第一种方法中所做的事情,但事实并非如此.

你能不能帮助我理解为什么会发生这种情况?


更新:根据Henk Holterman的建议,以下是实施TestTime:

public static void TestTime<T, TR>(Func<T, TR> action, T obj,
                                   int iterations)
{
    Stopwatch stopwatch = Stopwatch.StartNew();
    for (int i = 0; i < iterations; ++i)
        action(obj);
    Console.WriteLine(action.Method.Name + " took " + stopwatch.Elapsed);
}

public static void TestTime<T1, T2, TR>(Func<T1, T2, TR> action, T1 obj1,
                                        T2 obj2, int iterations)
{
    Stopwatch stopwatch = Stopwatch.StartNew();
    for (int i = 0; i < iterations; ++i)
        action(obj1, obj2);
    Console.WriteLine(action.Method.Name + " took " + stopwatch.Elapsed);
}
Run Code Online (Sandbox Code Playgroud)

Jon*_*eet 44

下限为0的单维数组与IL内的多维或非0下限数组不同(vector对比arrayIIRC).vector更容易使用 - 要获得元素x,你就是这么做pointer + size * x.对于a array,您必须pointer + size * (x-lower bound)为单维数组执行操作,并为您添加的每个维度添加更多算术.

基本上,CLR针对更常见的情况进行了优化.

  • 我对此感到困惑,多维数组应该比锯齿状数组更快.如果有的话,这是CLR的错. (23认同)
  • 一个好的编译器应该能够在循环前移动所有绑定的检查,并为d2生成与d1基本相同的代码.这证明了MS编译器不是很好(对于数组). (2认同)
  • @ILoveFortran:JIT编译器(实际发出或省略检查)针对执行速度进行了大量优化 - 目标是让JIT编译比典型的页面错误更快.即使这样,x64 JIT编译器也能完全实现你正在讨论的最优化,而新的编译器(还没有生产版本),RyuJIT,设法进行更多的优化.事实上,即使是x86编译器也会删除如果使用`for(i = 0; i <ar.Length; i ++)`,则完全绑定检查,因为这可以保证for循环本身的边界检查. (2认同)

Jee*_*Bee 10

数组边界检查?

单维数组有一个可以直接访问的长度成员 - 编译时这只是一个内存读取.

多维数组需要GetLength(int dimension)方法调用,该方法调用处理参数以获取该维度的相关长度.这不会编译成内存读取,所以你得到一个方法调用,等等.

此外,GetLength(int dimension)将对参数进行边界检查.


Cam*_*ron 5

有趣的是,我在 Vista 机器上使用 VS2008 NET3.5SP1 Win32 从上面运行了以下代码,在发布/优化中几乎无法测量差异,而调试/noopt 多暗阵列要慢得多。(我运行了三个测试两次以减少第二组的 JIT 影响。)

  Here are my numbers: 
    sum took 00:00:04.3356535
    sum took 00:00:04.1957663
    sum took 00:00:04.5523050
    sum took 00:00:04.0183060
    sum took 00:00:04.1785843 
    sum took 00:00:04.4933085
Run Code Online (Sandbox Code Playgroud)

看第二组三个数字。差异不足以让我对一维数组中的所有内容进行编码。

虽然我没有发布它们,但在调试/未优化中,多维与单/锯齿确实有很大的不同。

完整程序:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;

namespace single_dimension_vs_multidimension
{
    class Program
    {


        public static double sum(double[] d, int l1) {    // assuming the array is rectangular 
            double sum = 0; 
            int l2 = d.Length / l1; 
            for (int i = 0; i < l1; ++i)   
                for (int j = 0; j < l2; ++j)   
                    sum += d[i * l2 + j];   
            return sum;
        }

        public static double sum(double[,] d)
        {
            double sum = 0;  
            int l1 = d.GetLength(0);
            int l2 = d.GetLength(1);   
            for (int i = 0; i < l1; ++i)    
                for (int j = 0; j < l2; ++j)   
                    sum += d[i, j]; 
            return sum;
        }
        public static double sum(double[][] d)
        {
            double sum = 0;   
            for (int i = 0; i < d.Length; ++i) 
                for (int j = 0; j < d[i].Length; ++j) 
                    sum += d[i][j];
            return sum;
        }
        public static void TestTime<T, TR>(Func<T, TR> action, T obj, int iterations) 
        { 
            Stopwatch stopwatch = Stopwatch.StartNew();
            for (int i = 0; i < iterations; ++i)      
                action(obj);
            Console.WriteLine(action.Method.Name + " took " + stopwatch.Elapsed);
        }
        public static void TestTime<T1, T2, TR>(Func<T1, T2, TR> action, T1 obj1, T2 obj2, int iterations)
        {
            Stopwatch stopwatch = Stopwatch.StartNew(); 
            for (int i = 0; i < iterations; ++i)    
                action(obj1, obj2); 
            Console.WriteLine(action.Method.Name + " took " + stopwatch.Elapsed);
        }
        public static void Main() {   
            Random random = new Random(); 
            const int l1  = 1024, l2 = 1024; 
            double[ ] d1  = new double[l1 * l2]; 
            double[,] d2  = new double[l1 , l2];  
            double[][] d3 = new double[l1][];   
            for (int i = 0; i < l1; ++i)
            {
                d3[i] = new double[l2];   
                for (int j = 0; j < l2; ++j)  
                    d3[i][j] = d2[i, j] = d1[i * l2 + j] = random.NextDouble();
            }    
            const int iterations = 1000;
            TestTime<double[], int, double>(sum, d1, l1, iterations);
            TestTime<double[,], double>(sum, d2, iterations);

            TestTime<double[][], double>(sum, d3, iterations);
            TestTime<double[], int, double>(sum, d1, l1, iterations);
            TestTime<double[,], double>(sum, d2, iterations);
            TestTime<double[][], double>(sum, d3, iterations); 
        }

    }
}
Run Code Online (Sandbox Code Playgroud)