HttpClient causes a Node<Object> leak in mscorlib

14 c# memory memory-leaks httpclient mscorlib

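Consider the following program, with all of HttpRequestMessage, HttpResponseMessage, and HttpClient disposed properly. After collection it always ends up with about 50MB of memory in total. Add a zero to the number of requests, and the unreclaimed memory doubles.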

   class Program
    {
        static void Main(string[] args)
        {
            var client = new HttpClient { 
                   BaseAddress = new Uri("http://localhost:5000/")};

            var t = Task.Run(async () =>
            {
                var resps = new List<Task<HttpResponseMessage>>();
                var postProcessing = new List<Task>();

                for (int i = 0; i < 10000; i++)
                {
                    Console.WriteLine("Firing..");
                    var req = new HttpRequestMessage(HttpMethod.Get,
                                                        "test/delay/5");
                    var tsk = client.SendAsync(req);
                    resps.Add(tsk);
                    postProcessing.Add(tsk.ContinueWith(async ts =>
                    {
                        req.Dispose();
                        var resp = ts.Result;
                        var content = await resp.Content.ReadAsStringAsync();
                        resp.Dispose();
                        Console.WriteLine(content);
                    }));
                }

                await Task.WhenAll(resps);
                resps.Clear();
                Console.WriteLine("All requests done.");
                await Task.WhenAll(postProcessing);
                postProcessing.Clear();
                Console.WriteLine("All postprocessing done.");
            });

            t.Wait();
            Console.Clear();

            var t2 = Task.Run(async () =>
            {
                var resps = new List<Task<HttpResponseMessage>>();
                var postProcessing = new List<Task>();

                for (int i = 0; i < 10000; i++)
                {
                    Console.WriteLine("Firing..");
                    var req = new HttpRequestMessage(HttpMethod.Get,
                                                        "test/delay/5");
                    var tsk = client.SendAsync(req);
                    resps.Add(tsk);
                    postProcessing.Add(tsk.ContinueWith(async ts =>
                    {
                        var resp = ts.Result;
                        var content = await resp.Content.ReadAsStringAsync();
                        Console.WriteLine(content);
                    }));
                }

                await Task.WhenAll(resps);
                resps.Clear();
                Console.WriteLine("All requests done.");
                await Task.WhenAll(postProcessing);
                postProcessing.Clear();
                Console.WriteLine("All postprocessing done.");
            });

            t2.Wait();
            Console.Clear();
            client.Dispose();

            GC.Collect();
            Console.WriteLine("Done");
            Console.ReadLine();
        }
    }

On a quick investigation with a memory profiler, the objects occupying the memory all appear to be of type Node<Object> inside mscorlib.

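My initial thought was that it was some internal dictionary or stack, since those are the types that use Node as an internal structure, but I was unable to turn up any results for a generic Node<T> in the reference source, as this is actually of type Node<object>.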

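Is this a bug, or some kind of expected optimization (I wouldn't consider memory that grows in proportion to the request count and is always retained to be an optimization in any way)? And, purely academically, what is the Node<Object>?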

Any help in understanding this would be much appreciated. Thanks :)

Update: To extrapolate the results to a much larger test set, I optimized the program slightly by throttling it.

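Here is the changed program. Now it seems to stay consistent at 60-70MB for a set of 1 million requests. I'm still baffled by what those Node<object>s really are, and why they are allowed to hold on to such a large number of unreclaimable objects.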

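The logical conclusion from the difference between these two results leads me to guess that this may not really be an issue with HttpClient or WebRequest, but rather something rooted directly in async, since the real variable between the two tests is the number of incomplete async tasks that exist at a given point in time. This is merely speculation from a quick inspection.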

static void Main(string[] args)
{

    Console.WriteLine("Ready to start.");
    Console.ReadLine();

    var client = new HttpClient { BaseAddress = 
                    new Uri("http://localhost:5000/") };

    var t = Task.Run(async () =>
    {
        var resps = new List<Task<HttpResponseMessage>>();
        var postProcessing = new List<Task>();

        for (int i = 0; i < 1000000; i++)
        {
            //Console.WriteLine("Firing..");
            var req = new HttpRequestMessage(HttpMethod.Get, "test/delay/5");
            var tsk = client.SendAsync(req);
            resps.Add(tsk);
            var n = i;
            postProcessing.Add(tsk.ContinueWith(async ts =>
            {
                var resp = ts.Result;
                var content = await resp.Content.ReadAsStringAsync();
                if (n%1000 == 0)
                {
                    Console.WriteLine("Requests processed: " + n);
                }

                //Console.WriteLine(content);
            }));

            if (n%20000 == 0)
            {
                await Task.WhenAll(resps);
                resps.Clear();
            }

        }

        await Task.WhenAll(resps);
        resps.Clear();
        Console.WriteLine("All requests done.");
        await Task.WhenAll(postProcessing);
        postProcessing.Clear();
        Console.WriteLine("All postprocessing done.");
    });

    t.Wait();
    Console.Clear();
    client.Dispose();

    GC.Collect();
    Console.WriteLine("Done");
    Console.ReadLine();
}

And*_* Au 16

Let's investigate the problem with all the tools we have at hand.

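First, let's take a look at what those objects are. To do that, I put the given code in Visual Studio and created a simple console application. Alongside it, I ran a simple HTTP server on Node.js to serve the requests.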

I ran the client to the end, attached WinDBG to it, inspected the managed heap, and got these results:

0:037> !dumpheap
Address       MT     Size
02471000 00779700       10 Free
0247100c 72482744       84     
...
Statistics:
      MT    Count    TotalSize Class Name
...
72450e88      847        13552 System.Collections.Concurrent.ConcurrentStack`1+Node[[System.Object, mscorlib]]
...

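The !dumpheap command dumps all objects in the managed heap. That can include objects that should be freed (but have not been yet, because the GC has not kicked in). In our case that should be rare, because we call GC.Collect() just before the printout, and nothing else should run after the printout.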
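If you want a rough figure from inside the process before attaching the debugger, a minimal sketch along these lines may help. It assumes you only care about the managed heap size; the helper name ReportRetained is hypothetical and not part of the original program.

```csharp
using System;

static class HeapCheck
{
    // Hypothetical helper: force a full blocking collection, let pending
    // finalizers run, collect again, then report the managed heap size.
    public static long ReportRetained(string label)
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        long bytes = GC.GetTotalMemory(forceFullCollection: true);
        Console.WriteLine(label + ": " + bytes + " bytes on the managed heap");
        return bytes;
    }

    static void Main()
    {
        // In the question's program this would go where GC.Collect() is called.
        ReportRetained("After requests");
    }
}
```

Collecting twice around WaitForPendingFinalizers gives finalizable objects a chance to be reclaimed, so whatever remains is more likely to be genuinely rooted.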

The specific line shown above is worth noting. That should be the Node object you refer to in the question.

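Next, let's look at individual objects of that type. We take the MT value of that type and invoke !dumpheap again, which filters the output down to only the objects we are interested in.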

0:037> !dumpheap -mt 72450e88   
 Address       MT     Size
025b9234 72450e88       16     
025b93dc 72450e88       16     
...

Now pick a random one from the list and ask the debugger why that object is still on the heap by invoking the !gcroot command, as follows:

0:037> !gcroot 025bbc8c
Thread 6f24:
    0650f13c 79752354 System.Net.TimerThread.ThreadProc()
        edi:  (interior)
            ->  034734c8 System.Object[]
            ->  024915ec System.PinnableBufferCache
            ->  02491750 System.Collections.Concurrent.ConcurrentStack`1[[System.Object, mscorlib]]
            ->  09c2145c System.Collections.Concurrent.ConcurrentStack`1+Node[[System.Object, mscorlib]]
            ->  09c2144c System.Collections.Concurrent.ConcurrentStack`1+Node[[System.Object, mscorlib]]
            ->  025bbc8c System.Collections.Concurrent.ConcurrentStack`1+Node[[System.Object, mscorlib]]

Found 1 unique roots (run '!GCRoot -all' to see all roots).

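Now it is quite clear that we have a cache, and that the cache maintains a stack, with the stack implemented as a linked list. If we dig further, the reference source shows how that list is used. To do that, let's first inspect the cache object itself using !DumpObj: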

0:037> !DumpObj 024915ec 
Name:        System.PinnableBufferCache
MethodTable: 797c2b44
EEClass:     795e5bc4
Size:        52(0x34) bytes
File:        C:\WINDOWS\Microsoft.Net\assembly\GAC_MSIL\System\v4.0_4.0.0.0__b77a5c561934e089\System.dll
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
724825fc  40004f6        4        System.String  0 instance 024914a0 m_CacheName
7248c170  40004f7        8 ...bject, mscorlib]]  0 instance 0249162c m_factory
71fe994c  40004f8        c ...bject, mscorlib]]  0 instance 02491750 m_FreeList
71fed558  40004f9       10 ...bject, mscorlib]]  0 instance 025b93b8 m_NotGen2
72484544  40004fa       14         System.Int32  1 instance        0 m_gen1CountAtLastRestock
72484544  40004fb       18         System.Int32  1 instance 605289781 m_msecNoUseBeyondFreeListSinceThisTime
7248fc58  40004fc       2c       System.Boolean  1 instance        0 m_moreThanFreeListNeeded
72484544  40004fd       1c         System.Int32  1 instance      244 m_buffersUnderManagement
72484544  40004fe       20         System.Int32  1 instance      128 m_restockSize
7248fc58  40004ff       2d       System.Boolean  1 instance        1 m_trimmingExperimentInProgress
72484544  4000500       24         System.Int32  1 instance        0 m_minBufferCount
72484544  4000501       28         System.Int32  1 instance        0 m_numAllocCalls

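Now we see something interesting: the stack is actually used as the free list for the cache. The source code tells us how the free list is used, in particular in the Free() method shown below: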

http://referencesource.microsoft.com/#mscorlib/parent/parent/parent/parent/InternalApis/NDP_Common/inc/PinnableBufferCache.cs

/// <summary>
/// Return a buffer back to the buffer manager.
/// </summary>
[System.Security.SecuritySafeCritical]
internal void Free(object buffer)
{
  ...
  m_FreeList.Push(buffer);
}

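So that's it: when the caller is done with a buffer, it returns it to the cache, the cache puts it on the free list, and the free list is then used to satisfy later allocations: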

[System.Security.SecuritySafeCritical]
internal object Allocate()
{
  // Fast path, get it from our Gen2 aged m_FreeList.  
  object returnBuffer;
  if (!m_FreeList.TryPop(out returnBuffer))
    Restock(out returnBuffer);
  ...
}
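The Free()/Allocate() pair above can be sketched as a small stand-alone program. This is a simplified illustration, not the real PinnableBufferCache: the class name SimpleBufferCache, the factory delegate, and the 512-byte buffer size are assumptions, and the real Restock() logic is replaced by a single allocation.

```csharp
using System;
using System.Collections.Concurrent;

// Illustrative stand-in for the free-list pattern; NOT the real PinnableBufferCache.
sealed class SimpleBufferCache
{
    private readonly ConcurrentStack<object> _freeList = new ConcurrentStack<object>();
    private readonly Func<object> _factory;

    public SimpleBufferCache(Func<object> factory) { _factory = factory; }

    public int FreeCount { get { return _freeList.Count; } }

    public object Allocate()
    {
        object buffer;
        // Fast path: reuse a pooled buffer; otherwise create a new one.
        if (!_freeList.TryPop(out buffer))
            buffer = _factory();
        return buffer;
    }

    public void Free(object buffer)
    {
        // Each Push allocates a new internal ConcurrentStack<object>.Node.
        _freeList.Push(buffer);
    }
}

static class Program
{
    static void Main()
    {
        var cache = new SimpleBufferCache(() => new byte[512]);

        // Simulate 100 requests in flight at once, each holding a buffer.
        var inFlight = new object[100];
        for (int i = 0; i < inFlight.Length; i++) inFlight[i] = cache.Allocate();
        for (int i = 0; i < inFlight.Length; i++) cache.Free(inFlight[i]);

        Console.WriteLine(cache.FreeCount); // 100 buffers (and nodes) now pooled

        cache.Allocate();                   // a later request reuses a pooled buffer
        Console.WriteLine(cache.FreeCount); // 99
    }
}
```

In this sketch the number of pooled buffers, and therefore of Node objects, follows the peak number of buffers held at the same time rather than the total number of requests.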

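Last but not least, let's understand why the cache itself is not freed once all those HTTP requests are done. By setting a breakpoint on mscorlib.dll!System.Collections.Concurrent.ConcurrentStack.Push(), we see the following call stack (this may be just one of the cache's use cases, but it is representative):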

mscorlib.dll!System.Collections.Concurrent.ConcurrentStack<object>.Push(object item)
System.dll!System.PinnableBufferCache.Free(object buffer)
System.dll!System.Net.HttpWebRequest.FreeWriteBuffer()
System.dll!System.Net.ConnectStream.WriteHeadersCallback(System.IAsyncResult ar)
System.dll!System.Net.LazyAsyncResult.Complete(System.IntPtr userToken)
System.dll!System.Net.ContextAwareResult.Complete(System.IntPtr userToken)
System.dll!System.Net.LazyAsyncResult.ProtectedInvokeCallback(object result, System.IntPtr userToken)
System.dll!System.Net.Sockets.BaseOverlappedAsyncResult.CompletionPortCallback(uint errorCode, uint numBytes, System.Threading.NativeOverlapped* nativeOverlapped)
mscorlib.dll!System.Threading._IOCompletionCallback.PerformIOCompletionCallback(uint errorCode, uint numBytes, System.Threading.NativeOverlapped* pOVERLAP)

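In WriteHeadersCallback, we have finished writing the headers, so we return the buffer to the cache. At this point the buffer is pushed back onto the free list, and therefore a new stack node is allocated. The key thing to notice is that the cache object is a static member of HttpWebRequest.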

http://referencesource.microsoft.com/#System/net/System/Net/HttpWebRequest.cs

...
private static PinnableBufferCache _WriteBufferCache = new PinnableBufferCache("System.Net.HttpWebRequest", CachedWriteBufferSize);
...
// Return the buffer to the pinnable cache if it came from there.   
internal void FreeWriteBuffer()
{
  if (_WriteBufferFromPinnableCache)
  {
    _WriteBufferCache.FreeBuffer(_WriteBuffer);
    _WriteBufferFromPinnableCache = false;
  }
  _WriteBufferLength = 0;
  _WriteBuffer = null;
}
...

So there we go: the cache is shared across all requests and is not released when all the requests are done.
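The effect of the static field can be reproduced in isolation. The following is a minimal sketch under stated assumptions: FakeRequest, s_pool, and the 512-byte buffer are hypothetical stand-ins, not the HttpWebRequest implementation.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for a request type that owns a static buffer pool.
sealed class FakeRequest
{
    // Static field: a GC root that does not depend on any request instance.
    private static readonly ConcurrentStack<object> s_pool = new ConcurrentStack<object>();

    private object _buffer;

    public static int PooledCount { get { return s_pool.Count; } }

    // Takes a buffer, "uses" it, and returns it to the static pool.
    public WeakReference Send()
    {
        if (!s_pool.TryPop(out _buffer))
            _buffer = new byte[512];

        var tracker = new WeakReference(_buffer); // observes without keeping alive
        s_pool.Push(_buffer);
        _buffer = null;
        return tracker;
    }
}

static class Program
{
    static void Main()
    {
        // The request instance is unreachable as soon as Send() returns.
        WeakReference tracker = new FakeRequest().Send();

        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        Console.WriteLine(FakeRequest.PooledCount); // 1
        Console.WriteLine(tracker.IsAlive);         // True: still rooted by the static pool
    }
}
```

Disposing the request, or the HttpClient in the question's program, does not touch the static pool, which is why the pooled buffers and their nodes survive GC.Collect().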