iCe*_*iCe 6 .net iis debugging performance
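I'm investigating high CPU usage on one of our production servers; I'm stuck and hope someone here can help. Comparing CPU usage with the number of inbound web service calls (WCF and REST) tells me the two are unrelated: CPU usage seems to rise regardless of whether the number of calls goes up or down.
Our CPU usage has recently increased significantly, and the pattern seems to be that it climbs from midnight (when I recycle the application pool) and keeps climbing until the pool is recycled again.
By analysing Task Manager and using perfmon counters, I've been able to confirm that it is indeed our W3WP process that is consuming the CPU.
I configured DebugDiag 2.1 to take 3 memory dumps, 10 seconds apart, when CPU consumption exceeds a 50% threshold for more than 10 seconds. (Normal CPU usage before this issue was 5-10%.)
Looking at the dump files with ntsd, I found several threads that have consumed a lot of CPU: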
0:047> !runaway
User Mode Time
Thread Time
47:2920 0 days 0:24:42.921
49:1f1c 0 days 0:23:07.796
52:2ed8 0 days 0:21:38.218
54:1560 0 days 0:21:37.937
48:273c 0 days 0:21:19.140
59:2110 0 days 0:20:56.078
45:2d90 0 days 0:20:35.906
...
19:1c88 0 days 0:00:00.000
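(Only some of the threads are shown here.) So I tried to see what is going on in these threads, since this is not expected behaviour. All the threads with long running times appear to be managed, but when I run !clrstack on any of them I get: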
0:047> !clrstack
OS Thread Id: 0x2920 (47)
Child SP IP Call Site
GetFrameContext failed: 1
0000000000000000 0000000000000000 <unknown>
This puzzles me, since I was expecting a managed stack. When I look at the native stack I get something different:
0:047> !dumpstack
OS Thread Id: 0x2920 (47)
Current frame: ntdll!NtWaitForSingleObject+0xa
Child-SP RetAddr Caller, Callee
0000009c46eae730 00007fff0c131118 KERNELBASE!WaitForSingleObjectEx+0x94, calling ntdll!NtWaitForSingleObject
0000009c46eae7d0 00007fff051f91eb clr!CLREventWaitHelper2+0x38, calling kernel32!WaitForSingleObjectEx
0000009c46eae7e0 00007fff0c13155c KERNELBASE!SetEvent+0xc, calling ntdll!NtSetEvent
0000009c46eae810 00007fff051f9197 clr!CLREventWaitHelper+0x1f, calling clr!CLREventWaitHelper2
0000009c46eae870 00007fff051f9120 clr!CLREventBase::WaitEx+0x70, calling clr!CLREventWaitHelper
0000009c46eae8b0 00007fff052890e6 clr!SVR::t_join::join+0x106, calling clr!CLREventBase::WaitEx
0000009c46eae900 00007fff053d5913 clr!SVR::gc_heap::bgc_thread_function+0x97, calling clr!CLREventBase::WaitEx
0000009c46eae940 00007fff0533fcb6 clr!Thread::intermediateThreadProc+0x7d
0000009c46eaeb10 00007fff0ef4086d ntdll!RtlAllocateHeap+0x17d, calling ntdll!RtlpAllocateHeap
0000009c46eaec20 00007fff0ef40073 ntdll!RtlpSubSegmentInitialize+0x2f3, calling ntdll!RtlpHeapGenerateRandomValue32
0000009c46eaeca0 00007fff0ef40c65 ntdll!RtlpLowFragHeapAllocFromContext+0x355, calling ntdll!memset
0000009c46eaed10 00007fff0ef40c65 ntdll!RtlpLowFragHeapAllocFromContext+0x355, calling ntdll!memset
0000009c46eaed70 00007fff0ef407c7 ntdll!RtlAllocateHeap+0xd7, calling ntdll!RtlpLowFragHeapAllocFromContext
0000009c46eaede0 00007fff0ef407c7 ntdll!RtlAllocateHeap+0xd7, calling ntdll!RtlpLowFragHeapAllocFromContext
0000009c46eaee80 00007fff05f5a89a mscoree!calloc_impl+0x72, calling ntdll!RtlAllocateHeap
0000009c46eaeeb0 00007fff0c135ac4 KERNELBASE!SetTEBLangID+0x2c, calling ntdll!RtlSetLastWin32ErrorAndNtStatusFromNtStatus
0000009c46eaeef0 00007fff05cf15e6 mscoreei!calloc_impl+0x5d, calling ntdll!RtlAllocateHeap
0000009c46eaef20 00007fff05cf145b mscoreei!initptd+0xb7, calling mscoreei!unlock
0000009c46eaef30 00007fff0ef40c65 ntdll!RtlpLowFragHeapAllocFromContext+0x355, calling ntdll!memset
0000009c46eaef50 00007fff05cf138e mscoreei!CRT_INIT+0x135, calling kernel32!GetCurrentThreadId
0000009c46eaef80 00007fff05cf11ee mscoreei!__DllMainCRTStartup+0x8a, calling mscoreei!DllMain
0000009c46eaefe0 00007fff0c3b1387 00007fff0c3b1387
0000009c46eaf030 00007fff04c111d2 00007fff04c111d2, calling 00007fff04c11070
0000009c46eaf0c0 00007fff0ef407c7 ntdll!RtlAllocateHeap+0xd7, calling ntdll!RtlpLowFragHeapAllocFromContext
0000009c46eaf0f0 00007fff0ef2c187 ntdll!RtlDeactivateActivationContextUnsafeFast+0xc7, calling ntdll!_security_check_cookie
0000009c46eaf110 00007fff05085c6e MSVCR120_CLR0400!calloc_impl+0x5d, calling ntdll!RtlAllocateHeap
0000009c46eaf120 00007fff0ef2c2a3 ntdll!RtlActivateActivationContextUnsafeFast+0x93, calling ntdll!_security_check_cookie
0000009c46eaf140 00007fff05085d9b MSVCR120_CLR0400!initptd+0xb7, calling MSVCR120_CLR0400!unlock
0000009c46eaf160 00007fff03064d9c clrjit!__DllMainCRTStartup+0x8d, calling clrjit!DllMain
0000009c46eaf190 00007fff0ef4b9b8 ntdll!LdrpReleaseModuleEnumLock+0x1c, calling ntdll!RtlReleaseSRWLockShared
0000009c46eaf1a0 00007fff0ef2c324 ntdll!LdrpCallInitRoutine+0x4c
0000009c46eaf1c0 00007fff0ef4b96b ntdll!LdrpReleaseLoaderLock+0x27, calling ntdll!LdrpReleaseModuleEnumLock
0000009c46eaf200 00007fff0ef2c083 ntdll!LdrpInitializeThread+0x1f3, calling ntdll!LdrpReleaseLoaderLock
0000009c46eaf270 00007fff0ef2bfc3 ntdll!LdrpInitializeThread+0x133, calling ntdll!RtlActivateActivationContextUnsafeFast
0000009c46eaf278 00007fff0ef2bff6 ntdll!LdrpInitializeThread+0x166, calling ntdll!RtlDeactivateActivationContextUnsafeFast
0000009c46eaf2e0 00007fff0ef28fa3 ntdll!_LdrpInitialize+0x93, calling ntdll!NtTestAlert
0000009c46eaf350 00007fff0ef28ec8 ntdll!LdrInitializeThunk+0x18, calling ntdll!NtContinue
0000009c46eaf7c0 00007fff0533fc9f clr!Thread::intermediateThreadProc+0x66, calling clr!_chkstk
0000009c46eaf800 00007fff0e7713d2 kernel32!BaseThreadInitThunk+0x22
0000009c46eaf830 00007fff0ef25444 ntdll!RtlUserThreadStart+0x34
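This leads me to believe the thread is waiting for some resource (is that correct?), and this is where I get really lost. What is going on in these lines in the middle of the trace: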
0000009c46eaefe0 00007fff0c3b1387 00007fff0c3b1387
0000009c46eaf030 00007fff04c111d2 00007fff04c111d2, calling 00007fff04c11070
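My guess is that some managed code is running here, but then why doesn't !clrstack work? Looking at the first frame, it looks like the thread is waiting on some resource handle. It looks like my handle is 0xa, but I'm not sure about that. Looking at handle 0xa with !handle 0xa ff I get this: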
0:047> !handle 0xa ff
Handle 000000000000000a
Type File
Attributes 0
GrantedAccess 0x100020:
Synch
Execute/Traverse
HandleCount 2
PointerCount 65535
No object specific information available
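This tells me it points to a file, but which file, and how do I proceed from here? Looking at the other top !runaway threads gives me the same picture.
I know this is a huge ask, but I really need help moving on from here. Am I on the right track, or am I just fumbling in the dark? Any help would be much appreciated!
After getting our IT department to record a perfmon data collector set with some counters I found interesting, I've come to the conclusion that: 1) we are leaking threads, and 2) the GC is going crazy (probably because of the leak). Any ideas on how to detect what is causing the thread leak? See the counters here: 2
After discovering that we are leaking threads, I have been looking through our code and found some code of interest: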
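One low-cost way to narrow down a suspected thread leak from inside the process is to log the OS thread count next to the thread-pool headroom at regular intervals, and see which of the two drifts. The sketch below is only an illustration under that assumption; the `ThreadStats` class and its `Snapshot` method are hypothetical names and not part of the original code base.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper: returns a one-line summary of thread usage.
public static class ThreadStats
{
    public static string Snapshot()
    {
        int osThreads;
        // All OS threads in this process (managed and native).
        using (Process process = Process.GetCurrentProcess())
        {
            osThreads = process.Threads.Count;
        }

        int maxWorkers, maxIo, freeWorkers, freeIo;
        ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
        ThreadPool.GetAvailableThreads(out freeWorkers, out freeIo);

        // Busy pool threads = configured maximum minus currently available.
        return string.Format(
            "osThreads={0} busyWorkers={1} busyIo={2}",
            osThreads, maxWorkers - freeWorkers, maxIo - freeIo);
    }
}
```

If `osThreads` grows while the busy pool counts stay flat, the leaked threads are probably being created outside the thread pool; if the busy worker count grows, work items are likely blocking and not returning.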
// Initialize TimerExecutionEveryMinute timer
const double timeConversion = 60 * 1000; //convert one minute to milliseconds
var timer1 = new System.Timers.Timer { Enabled = true, Interval = timeConversion };
timer1.Elapsed += TimerExecutionEveryMinute;
And then:
// Execution every minute
public static void TimerExecutionEveryMinute(object sender, EventArgs e)
{
    var jpsLogger = KernelContainer.Kernel.Get<IJpsLogger>();

    // Initialize MemBag
    MemBag.Log.ActivityIdReset(Guid.NewGuid());
    MemBag.Log.BaseType = "TimerExecution";
    MemBag.Log.BaseClass = "Timer";
    MemBag.Log.BaseMethod = "TimerExecutionEveryMinute";

    // Statistic timer job
    var t1 = jpsLogger.Trace.SpecializedDebug("Analyses.OneMinuteTimer", "One minute timer begin");
    var t2 = jpsLogger.Trace.SpecializedDebug("Analyses.OneMinuteTimer", " Method.WriteDB begin");
    Method.WriteDB();
    jpsLogger.Trace.SpecializedDebug("Analyses.OneMinuteTimer", " Method.WriteDB end", t2);

    t2 = jpsLogger.Trace.SpecializedDebug("Analyses.OneMinuteTimer", " Memory.LogCurrentState begin");
    Memory.LogCurrentState();
    jpsLogger.Trace.SpecializedDebug("Analyses.OneMinuteTimer", " Memory.LogCurrentState end", t2);

    // Calculates the CPU load based on samples taken at every timer step
    t2 = jpsLogger.Trace.SpecializedDebug("Analyses.OneMinuteTimer", " CPU load begin");
    CPULogger.LogCpu();
    jpsLogger.Trace.SpecializedDebug("Analyses.OneMinuteTimer", " CPU load end", t2);

    // Dump log information to file
    t2 = jpsLogger.Trace.SpecializedDebug("Analyses.OneMinuteTimer", " FileLogger.WriteAsync begin");
    FileLogger.WriteAsync();
    jpsLogger.Trace.SpecializedDebug("Analyses.OneMinuteTimer", " FileLogger.WriteAsync end", t2);

    jpsLogger.Trace.SpecializedDebug("Analyses.OneMinuteTimer", "One minute timer end", t1);
}
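Could this be where we are leaking threads? I believe System.Timers.Timer spins up a new thread every time it raises an event, and that it is thread-safe, so I put locks around the executing code. The code that runs every minute writes to a log file; my thesis is that if access to the file is blocked, threads keep piling up, which would explain the linear increase in the number of logical threads shown in 2.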
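For context on the pile-up theory: when SynchronizingObject is null, System.Timers.Timer raises Elapsed on a thread-pool thread, and if a handler runs longer than Interval the event can be raised again on another pool thread. Handlers that block on a lock can therefore accumulate. A minimal sketch of one way to avoid that, by skipping a tick while the previous one is still running, might look like the following; the `OneMinuteTimer` class and the `DoWork` method are hypothetical stand-ins for the real logging calls.

```csharp
using System;
using System.Threading;
using System.Timers;

public static class OneMinuteTimer
{
    // 0 = idle, 1 = a handler is currently running.
    private static int _running;

    // Held in a static field so the timer stays referenced
    // for the lifetime of the application.
    private static System.Timers.Timer _timer;

    public static void Start()
    {
        _timer = new System.Timers.Timer { Interval = 60 * 1000, AutoReset = true };
        _timer.Elapsed += OnElapsed;
        _timer.Start();
    }

    private static void OnElapsed(object sender, ElapsedEventArgs e)
    {
        // If the previous tick is still running, skip this one
        // rather than parking another pool thread behind a lock.
        if (Interlocked.CompareExchange(ref _running, 1, 0) != 0)
        {
            return;
        }

        try
        {
            DoWork();
        }
        finally
        {
            // Always clear the flag, even if DoWork throws.
            Interlocked.Exchange(ref _running, 0);
        }
    }

    // Hypothetical placeholder for the per-minute logging work.
    private static void DoWork()
    {
        Console.WriteLine("tick at {0:O}", DateTime.UtcNow);
    }
}
```

An alternative with the same effect is to set AutoReset to false and call Start() again at the end of the handler, so the next interval only begins once the current run has finished.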
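OK, here is a partial answer to the question. There appear to be several causes of the high CPU usage, but I managed to find one of the reasons the GC was going crazy.
Somewhere in our code a developer had inserted a piece of code that logs the current memory usage every minute (for monitoring and debugging purposes). It was implemented like this: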
GC.GetTotalMemory(true)
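Looking this method up in the documentation told me that the bool parameter actually forces a full GC every time the method is called: a full GC every single minute! No wonder our CPU and GC time spiked. Changing it to false halved the CPU usage. We still have performance/resource issues, but this is a big improvement.
Hope this helps someone.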
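As a rough sketch of the changed call, the per-minute memory logging can read the current estimate without waiting for a collection, and log the cumulative collection counts so that GC activity stays visible. The `MemorySnapshot` class and its `Log` method are hypothetical names used for illustration only; the real logging code is not shown in the question.

```csharp
using System;

// Hypothetical helper illustrating the non-forcing call.
public static class MemorySnapshot
{
    public static void Log()
    {
        // false: return the current estimate immediately,
        // without waiting for a garbage collection first.
        long managedBytes = GC.GetTotalMemory(false);

        // Cumulative number of collections since the process started.
        int gen0 = GC.CollectionCount(0);
        int gen2 = GC.CollectionCount(GC.MaxGeneration);

        Console.WriteLine(
            "managedBytes={0} gen0Collections={1} gen2Collections={2}",
            managedBytes, gen0, gen2);
    }
}
```

Comparing the gen 2 count between two consecutive log lines gives a quick indication of how many full collections happened in that minute.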