Sam*_*ron 13 c# performance hashset
出于某种原因,似乎Adda HashSet上的Contains操作比元素已经存在时的操作要慢HashSet.
这是证明:
Stopwatch watch = new Stopwatch();
int size = 10000;
int iterations = 10000;
var s = new HashSet<int>();
for (int i = 0; i < size; i++) {
s.Add(i);
}
Console.WriteLine(watch.Time(() =>
{
for (int i = 0; i < size; i++) {
s.Add(i);
}
}, iterations));
s = new HashSet<int>();
for (int i = 0; i < size; i++) {
s.Add(i);
}
// outputs: 47,074,764
Console.WriteLine(watch.Time(() =>
{
for (int i = 0; i < size; i++) {
if (!s.Contains(i))
s.Add(i);
}
}, iterations));
// outputs: 41,125,219
Run Code Online (Sandbox Code Playgroud)
为什么Contains比Add现有元素更快?
注意:我正在使用Stopwatch另一个SO问题的扩展名.
public static long Time(this Stopwatch sw, Action action, int iterations) {
sw.Reset();
sw.Start();
for (int i = 0; i < iterations; i++) {
action();
}
sw.Stop();
return sw.ElapsedTicks;
}
Run Code Online (Sandbox Code Playgroud)
更新:内部测试表明,大型性能差异只发生在.NET框架的x64版本上.使用32位版本的框架Contains似乎以相同的速度运行添加(实际上看起来带有包含的版本在某些测试运行中运行速度慢了一些)在X64版本的框架上,带有包含的版本似乎运行速度提高约15%.
AddIfNotPresent执行Contains不执行的额外划分.看一下IL for Contains:
IL_000a: call instance int32 class System.Collections.Generic.HashSet`1<!T>::InternalGetHashCode(!0)
IL_000f: stloc.0
IL_0010: ldarg.0
IL_0011: ldfld int32[] class System.Collections.Generic.HashSet`1<!T>::m_buckets
IL_0016: ldloc.0
IL_0017: ldarg.0
IL_0018: ldfld int32[] class System.Collections.Generic.HashSet`1<!T>::m_buckets
IL_001d: ldlen
IL_001e: conv.i4
IL_001f: rem
IL_0020: ldelem.i4
IL_0021: ldc.i4.1
IL_0022: sub
IL_0023: stloc.1
Run Code Online (Sandbox Code Playgroud)
这是计算哈希码的存储区位置.结果保存在本地内存位置1.
AddIfNotPresent执行类似的操作,但它还将计算值保存在位置2,以便在项目不存在时它可以将项目插入到该位置的哈希表中.这样做可以节省,因为其中一个位置稍后会在寻找项目的循环中进行修改.无论如何,这是AddIfNotPresent的相关代码:
IL_0011: call instance int32 class System.Collections.Generic.HashSet`1<!T>::InternalGetHashCode(!0)
IL_0016: stloc.0
IL_0017: ldloc.0
IL_0018: ldarg.0
IL_0019: ldfld int32[] class System.Collections.Generic.HashSet`1<!T>::m_buckets
IL_001e: ldlen
IL_001f: conv.i4
IL_0020: rem
IL_0021: stloc.1
IL_0022: ldarg.0
IL_0023: ldfld int32[] class System.Collections.Generic.HashSet`1<!T>::m_buckets
IL_0028: ldloc.0
IL_0029: ldarg.0
IL_002a: ldfld int32[] class System.Collections.Generic.HashSet`1<!T>::m_buckets
IL_002f: ldlen
IL_0030: conv.i4
IL_0031: rem
IL_0032: ldelem.i4
IL_0033: ldc.i4.1
IL_0034: sub
IL_0035: stloc.2
Run Code Online (Sandbox Code Playgroud)
无论如何,我认为额外的分歧是导致Add比Case包含更多时间的原因.乍一看,似乎可以考虑额外的分歧,但我不能肯定地说,不花一点时间来破译IL.
| 归档时间: |
|
| 查看次数: |
9607 次 |
| 最近记录: |