jrw*_*jrw 3 c# performance .net-7.0
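I am developing a performance-sensitive application and am considering migrating from .NET 6 to .NET 7.
While comparing the two versions, I found that .NET 7 executes for loops more slowly on the initial run.
The test used two separate console applications with identical code, one targeting .NET 6 and the other .NET 7, both run in Release mode on Any CPU.
Test code: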
using System.Diagnostics;
int size = 1000000;
Stopwatch sw = new();
//create array
float[] arr = new float[size];
for (int i = 0; i < size; i++)
arr[i] = i;
Console.WriteLine(AppDomain.CurrentDomain.SetupInformation.TargetFrameworkName);
Console.WriteLine($"\nForLoop1");
ForLoop1();
ForLoop1();
ForLoop1();
ForLoop1();
ForLoop1();
Console.WriteLine($"\nForLoopArray");
ForLoopArray();
ForLoopArray();
ForLoopArray();
ForLoopArray();
ForLoopArray();
Console.WriteLine($"\nForLoop2");
ForLoop2();
ForLoop2();
ForLoop2();
ForLoop2();
ForLoop2();
void ForLoop1()
{
sw.Restart();
int sum = 0;
for (int i = 0; i < size; i++)
sum++;
sw.Stop();
Console.WriteLine($"{sw.ElapsedTicks} ticks ({sum})");
}
void ForLoopArray()
{
sw.Restart();
float sum = 0f;
for (int i = 0; i < size; i++)
sum += arr[i];
sw.Stop();
Console.WriteLine($"{sw.ElapsedTicks} ticks ({sum})");
}
void ForLoop2()
{
sw.Restart();
int sum = 0;
for (int i = 0; i < size; i++)
sum++;
sw.Stop();
Console.WriteLine($"{sw.ElapsedTicks} ticks ({sum})");
}
Console output for the .NET 6 version:
.NETCoreApp,Version=v6.0
ForLoop1
2989 ticks (1000000)
2846 ticks (1000000)
2851 ticks (1000000)
3180 ticks (1000000)
2841 ticks (1000000)
ForLoopArray
8270 ticks (4.9994036E+11)
8443 ticks (4.9994036E+11)
8354 ticks (4.9994036E+11)
8952 ticks (4.9994036E+11)
8458 ticks (4.9994036E+11)
ForLoop2
2842 ticks (1000000)
2844 ticks (1000000)
3117 ticks (1000000)
2835 ticks (1000000)
2992 ticks (1000000)
.NET 7 version:
[output not preserved in this copy]
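As you can see, the .NET 6 timings are all very similar, whereas the .NET 7 timings show high initial values (19658, 20041 and 14016).
Fiddling with the environment variables DOTNET_ReadyToRun and DOTNET_TieredPGO only made things worse.
Why is this, and how can it be corrected?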
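My guess is that this is connected to the new On-Stack Replacement (OSR) feature introduced in .NET 7. Enabling DOTNET_JitDisasmSummary "on my machine" (Windows PowerShell: $env:DOTNET_JitDisasmSummary=1) produces the following output: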
ForLoop1
9: JIT compiled Program:<<Main>$>g__ForLoop1|0_0(byref) [Tier0, IL size=118, code size=291]
10: JIT compiled Program:<<Main>$>g__ForLoop1|0_0(byref) [Tier1-OSR @0x19, IL size=118, code size=571]
13420 ticks (1000000)
2431 ticks (1000000)
...
ForLoopArray
11: JIT compiled Program:<<Main>$>g__ForLoopArray|0_1(byref) [Tier0, IL size=129, code size=339]
12: JIT compiled Program:<<Main>$>g__ForLoopArray|0_1(byref) [Tier1-OSR @0x24, IL size=129, code size=609]
13: JIT compiled System.SpanHelpers:SequenceCompareTo(byref,int,byref,int) [Tier1, IL size=632, code size=329]
19380 ticks (4.9994036E+11)
10694 ticks (4.9994036E+11)
...
ForLoop2
14: JIT compiled Program:<<Main>$>g__ForLoop2|0_2(byref) [Tier0, IL size=118, code size=291]
15: JIT compiled Program:<<Main>$>g__ForLoop2|0_2(byref) [Tier1-OSR @0x19, IL size=118, code size=549]
11720 ticks (1000000)
2549 ticks (1000000)
...
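The listing suggests the extra cost is paid inside the first call, where each method is compiled at Tier0 and then again as Tier1-OSR while its loop is already running. If only steady-state numbers matter, one option is to run the workload once untimed before measuring. This is a minimal sketch under that assumption; `Measure` is a hypothetical helper, not part of the question's code:

```csharp
using System;
using System.Diagnostics;

int size = 1_000_000;

// Untimed warm-up call: the JIT work triggered by the first invocation
// (Tier0 compilation and any OSR transition) happens here.
Measure(size);

for (int run = 0; run < 3; run++)
{
    var (ticks, sum) = Measure(size);
    Console.WriteLine($"{ticks} ticks ({sum})");
}

// Hypothetical helper: counts to n and returns the elapsed Stopwatch ticks
// together with the result, so the loop cannot be discarded as unused.
static (long Ticks, int Sum) Measure(int n)
{
    long start = Stopwatch.GetTimestamp();
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum++;
    return (Stopwatch.GetTimestamp() - start, sum);
}
```

This only hides the first-call cost from the measurement; it does not remove it from the application's startup.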
Setting DOTNET_TC_QuickJitForLoops to 0 ($env:DOTNET_TC_QuickJitForLoops=0) "reverts" this behaviour (I am not sure why, since the documentation states that the default value is false; perhaps something changed in .NET 7).
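If changing a process-wide setting is undesirable, a per-method alternative may be to opt individual hot methods out of tiering. The sketch below is an assumption on my part and was not measured for this answer: it marks a hypothetical helper `SumArray` with `MethodImplOptions.AggressiveOptimization`, which asks the JIT to compile the method fully optimized on its first call instead of starting at Tier0.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

float[] data = new float[1_000_000];
for (int i = 0; i < data.Length; i++)
    data[i] = i;

// Time the first and second call separately so any first-call penalty is visible.
for (int run = 0; run < 2; run++)
{
    long start = Stopwatch.GetTimestamp();
    float sum = SumArray(data);
    long elapsed = Stopwatch.GetTimestamp() - start;
    Console.WriteLine($"run {run}: {elapsed} ticks ({sum})");
}

// Hypothetical helper, not part of the question's code.
// AggressiveOptimization requests fully optimized code up front,
// so this method does not go through Tier0 first.
[MethodImpl(MethodImplOptions.AggressiveOptimization)]
static float SumArray(float[] values)
{
    float sum = 0f;
    for (int i = 0; i < values.Length; i++)
        sum += values[i];
    return sum;
}
```

Methods marked this way are not tiered, so they also do not benefit from dynamic PGO; whether the trade-off pays off needs to be measured on your workload.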
P.S.
If your code is performance sensitive, and especially if it is sensitive to startup performance, it may be worth looking into Native AOT.