嵌套循环中的循环优化c#

Question

嵌套循环中的循环优化c#

Eva*_*ons 2 c# asp.net optimization for-loop

我正在比较从二进制文件生成的两个数据列表.我很清楚为什么它运行缓慢,当有大量记录时,它会做不需要的冗余工作.

例如,如果a1 = a1,则条件为真.既然2a!= 1a那么为什么还要费心去检查呢？我需要再次检查1a.如果我不这样做,它会在检查第400,000条记录时检查第一条记录.我考虑过将第二个for循环作为foreach,但是在迭代嵌套循环时我无法删除1a

"for循环"中可以包含的项目数量可能会有所不同.我不认为使用'i'的单个for循环将起作用,因为匹配可以在任何地方.我正在读取二进制文件

这是我目前的代码.程序已经运行了一个多小时,而且还在继续.出于可读性原因,我删除了很多迭代代码.

   for (int i = 0; i < origItemList.Count; i++)
    {
        int modFoundIndex = 0;
        Boolean foundIt = false;
        for (int g = 0; g < modItemList.Count; g++)
        {
            if ((origItemList[i].X == modItemList[g].X)
                && (origItemList[i].Y == modItemList[g].Y)
                && (origItemList[i].Z == modItemList[g].Z)
                && (origItemList[i].M == modItemList[g].M))
            {

            foundIt = true;
            modFoundIndex = g;
            break;

            }
            else
            {
                foundIt = false;
            }

        }
        if (foundIt)                         
        {
            /*
             * This is run assumming it finds an x,y,z,m 
             coordinate. It thenchecks the database file.
             * 
             */
            //grab the rows where the coordinates match 
            DataRow origRow = origDbfFile.dataset.Tables[0].Rows[i];
            DataRow modRow = modDbfFile.dataset.Tables[0].Rows[modFoundIndex];

            //number matched indicates how many columns were matched
            int numberMatched = 0;

            //get the number of columns to match in order to detect all changes
            int numOfColumnsToMatch = origDbfFile.datatable.Columns.Count;

            List<String> mismatchedColumns = new List<String>();

            //check each column name for a change
            foreach (String columnName in columnNames)
            {
                //this grabs whatever value is in that field                            
                String origRowValue = "" + origRow.Field<Object>(columnName);
                String modRowValue = "" + modRow.Field<Object>(columnName);

                //check if they are the same
                if (origRowValue.Equals(modRowValue))
                {
                    //if they aren the same, increase the number matched by one
                    numberMatched++;
                    //add the column to the list of columns that don't match

                }
                else
                {
                    mismatchedColumns.Add(columnName);
                }

            }
            /* In the event it matches 15/16 columns, show the change */
            if (numberMatched != numOfColumnsToMatch)
            {
                //Grab the shapeFile in question
                Item differentAttrShpFile = origItemList[i];
                //start blue highlighting
                result += "<div class='turnBlue'>";
                //show where the change was made at
                result += "Change Detected at<br/> point X: " +
                differentAttrShpFile.X + ",<br/> point Y: " +
                    differentAttrShpFile.Y + ",<br/>";
                result += "</div>"; //end turnblue div
                foreach (String mismatchedColumn in mismatchedColumns)
                {
                    //iterate changes here

                }

            }

        }

    }

Run Code Online (Sandbox Code Playgroud)

Answer 1

Lor*_*tel 8

你是以完全错误的方式来到这里的.你所拥有的循环是O(n ^ 2),当你发现匹配时平均会将命中时间缩短一半,这还不够.如果列表中有25万个项目,则此循环执行620亿次,即使编译器优化了额外的数组查找,您仍然需要查看至少一万亿条指令.如果你可以帮助它,你不要为大n做O(n ^ 2)!

你需要做的是摆脱这个O(n ^ 2)方面.我的建议:

1)定义一个散列函数,它查看x,y,z和m并得出一个整数值,我倾向于使用一个是你的目标平台的单词大小.

2)迭代两个列表,计算所有内容的哈希值.

3)为其中一个表,哈希和对象构建索引.我怀疑字典是最好的数据结构,但是一个简单的排序数组也可以.

4)迭代未构建索引的列表,将哈希值与索引中的条目进行比较.如果它是一个O(n)任务的哈希,如果它是一个排序数组,那么它是O(n log n).

5)当哈希匹配进行完全比较以确认命中是真实的,因为你会偶尔碰到一个好的64位哈希,如果你的哈希值是32位,你会得到相当数量的.

归档时间：	13 年前
查看次数：	1448 次
最近记录：	12 年，12 月前