arv*_*ind 7 c# linq performance sum
我正在研究一个使用大量求和方法的项目的一部分.这些求和方法应用于Datatable
为了测试最佳方法,我使用以下方法
数据结构
class LogParser
{
public DataTable PGLStat_Table = new DataTable();
public LogParser()
{
PGLStat_Table.Columns.Add("type", typeof(string));
PGLStat_Table.Columns.Add("desc", typeof(string));
PGLStat_Table.Columns.Add("count", typeof(int));
PGLStat_Table.Columns.Add("duration", typeof(decimal));
PGLStat_Table.Columns.Add("cper", typeof(decimal));
PGLStat_Table.Columns.Add("dper", typeof(decimal));
PGLStat_Table.Columns.Add("occurancedata", typeof(string));
}
}
Run Code Online (Sandbox Code Playgroud)
以下方法用于填充表格
LogParser pglp = new LogParser();
Random r2 = new Random();
for (int i = 1; i < 1000000; i++)
{
int c2 = r2.Next(1, 1000);
pglp.PGLStat_Table.Rows.Add("Type" + i.ToString(), "desc" + i , c2, 0, 0, 0, " ");
}
Run Code Online (Sandbox Code Playgroud)
以下方法用于计算总和
方法1使用Compute
Stopwatch s2 = new Stopwatch();
s2.Start();
object sumObject;
sumObject = pglp.PGLStat_Table.Compute("Sum(count)", " ");
s2.Stop();
long d1 = s2.ElapsedMilliseconds;
Run Code Online (Sandbox Code Playgroud)
方法2使用Foreach循环
s2.Restart();
int totalcount = 0;
foreach (DataRow dr in pglp.PGLStat_Table.Rows)
{
int c = Convert.ToInt32(dr["count"].ToString());
totalcount = totalcount + c;
}
s2.Stop();
long d2 = s2.ElapsedMilliseconds;
Run Code Online (Sandbox Code Playgroud)
方法3使用Linq
s2.Restart();
var sum = pglp.PGLStat_Table.AsEnumerable().Sum(x => x.Field<int>("count"));
MessageBox.Show(sum.ToString());
s2.Stop();
long d3 = s2.ElapsedMilliseconds;
Run Code Online (Sandbox Code Playgroud)
比较结果后
a)foreach是最快的481ms
b)接下来是linq 1016ms
c)然后计算2253ms
查询1
我在下面的语句中意外地将"c2改为i"
pglp.PGLStat_Table.Rows.Add("Type" + i.ToString(), "desc" + i , i, 0, 0, 0, " ");
Run Code Online (Sandbox Code Playgroud)
Linq语句产生错误
算术运算导致溢出.
而Compute和Foreach循环仍然能够完成计算,尽管可能不正确.
这种行为是引起关注还是我错过了指令?(计算的数字也很大)
查询2
我的印象是Linq做得最快,是否有优化的方法或参数使其表现更好.
谢谢你的建议
阿文德
接下来是最快的总和(使用预计算 DataColumn 并直接转换为 int):
static int Sum(LogParser pglp)
{
var column = pglp.PGLStat_Table.Columns["count"];
int totalcount = 0;
foreach (DataRow dr in pglp.PGLStat_Table.Rows)
{
totalcount += (int)dr[column];
}
return totalcount;
}
Run Code Online (Sandbox Code Playgroud)
统计:
00:00:00.1442297, for/each, by column, (int)
00:00:00.1595430, for/each, by column, Field<int>
00:00:00.6961964, for/each, by name, Convert.ToInt
00:00:00.1959104, linq, cast<DataRow>, by column, (int)
Run Code Online (Sandbox Code Playgroud)
其他代码:
static int Sum_ForEach_ByColumn_Field(LogParser pglp)
{
var column = pglp.PGLStat_Table.Columns["count"];
int totalcount = 0;
foreach (DataRow dr in pglp.PGLStat_Table.Rows)
{
totalcount += dr.Field<int>(column);
}
return totalcount;
}
static int Sum_ForEach_ByName_Convert(LogParser pglp)
{
int totalcount = 0;
foreach (DataRow dr in pglp.PGLStat_Table.Rows)
{
int c = Convert.ToInt32(dr["count"].ToString());
totalcount = totalcount + c;
}
return totalcount;
}
static int Sum_Linq(LogParser pglp)
{
var column = pglp.PGLStat_Table.Columns["count"];
return pglp.PGLStat_Table.Rows.Cast<DataRow>().Sum(row => (int)row[column]);
}
var data = GenerateData();
Sum(data);
Sum_Linq2(data);
var count = 3;
foreach (var info in new[]
{
new {Name = "for/each, by column, (int)", Method = (Func<LogParser, int>)Sum},
new {Name = "for/each, by column, Field<int>", Method = (Func<LogParser, int>)Sum_ForEach_ByColumn_Field},
new {Name = "for/each, by name, Convert.ToInt", Method = (Func<LogParser, int>)Sum_ForEach_ByName_Convert},
new {Name = "linq, cast<DataRow>, by column, (int)", Method = (Func<LogParser, int>)Sum_Linq},
})
{
var watch = new Stopwatch();
for (var i = 0; i < count; ++i)
{
watch.Start();
var sum = info.Method(data);
watch.Stop();
}
Console.WriteLine("{0}, {1}", TimeSpan.FromTicks(watch.Elapsed.Ticks / count), info.Name);
}
Run Code Online (Sandbox Code Playgroud)