C#: launching a process leaks memory (on Linux) even after it is killed and disposed

Teh*_*hGM 6 .net c# linux memory-leaks .net-core

Note: according to testing (see the edit below), this happens only on Linux machines.

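I have an ASP.NET Core Blazor application (using the server-side hosting model) running on a Raspberry Pi. Part of the application's functionality is dimming/brightening the screen based on when the system was last interacted with. To do that, every 1 second or so I spawn a terminal child process that runs xprintidle, parse its output, and act accordingly.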

I use DataDog for monitoring, and it shows memory leaking until the system crashes (it takes a few days to use up all the memory, but it does eventually happen).

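I have pinpointed the following method as the cause of the leak: if I skip calling it and use a constant TimeSpan instead, memory does not leak. This is the code: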

// note this code has some parts that aren't even needed - I was simply trying anything to solve this problem at this point
public async Task<TerminalResult> ExecuteAndWaitAsync(string command, bool asRoot, CancellationToken cancellationToken = default)
{
    using Process prc = CreateNewProcess(command, asRoot);
    // we need to redirect stdstreams to read them
    prc.StartInfo.RedirectStandardOutput = true;
    prc.StartInfo.RedirectStandardError = true;

    // start the process
    _log.LogTrace("Starting the process");
    using Task waitForExitTask = WaitForExitAsync(prc, cancellationToken);
    prc.Start();

    // read streams
    string[] streamResults = await Task.WhenAll(prc.StandardOutput.ReadToEndAsync(), prc.StandardError.ReadToEndAsync()).ConfigureAwait(false);

    // wait till it fully exits, but no longer than half a second
    // this prevents hanging when process has already finished, but takes long time to fully close
    await Task.WhenAny(waitForExitTask, Task.Delay(500, cancellationToken)).ConfigureAwait(false);
    // if process still didn't exit, force kill it
    if (!prc.HasExited)
        prc.Kill(true);  // doing it with a try-catch approach instead of HasExited check gives no difference
    return new TerminalResult(streamResults[0], streamResults[1]);
}

public Task<int> WaitForExitAsync(Process process, CancellationToken cancellationToken = default)
{
    TaskCompletionSource<int> tcs = new TaskCompletionSource<int>();
    IDisposable tokenRegistration = null;
    EventHandler callback = null;
    tokenRegistration = cancellationToken.Register(() =>
    {
        Unregister();
        tcs.TrySetCanceled(cancellationToken);
    });
    callback = (sender, args) =>
    {
        Unregister();
        tcs.TrySetResult(process.ExitCode);
    };
    process.Exited += callback;
    process.EnableRaisingEvents = true;

    void Unregister()
    {
        lock (tcs)
        {
            if (tokenRegistration == null)
                return;
            process.EnableRaisingEvents = false;
            process.Exited -= callback;
            tokenRegistration?.Dispose();
            tokenRegistration = null;
        }
    }

    return tcs.Task;
}

private Process CreateNewProcess(string command, bool asRoot)
{
    _log.LogDebug("Creating process: {Command}", command);
    Process prc = new Process();

    if (RuntimeInformation.IsOSPlatform(OSPlatform.Linux))
    {
        string escapedCommand = command.Replace("\"", "\\\"");
        // if as root, just sudo it
        if (asRoot)
            prc.StartInfo = new ProcessStartInfo("/bin/bash", $"-c \"sudo {escapedCommand}\"");
        // if not as root, we need to open it as current user
        // this may still run as root if the process is running as root
        else
            prc.StartInfo = new ProcessStartInfo("/bin/bash", $"-c \"{escapedCommand}\"");
    }
    else if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
    {
        prc.StartInfo = new ProcessStartInfo("CMD.exe", $"/C {command}");
        if (asRoot)
            prc.StartInfo.Verb = "runas";
    }
    else
        throw new PlatformNotSupportedException($"{nameof(ExecuteAndWaitAsync)} is only supported on Windows and Linux platforms.");

    prc.StartInfo.UseShellExecute = false;
    prc.StartInfo.CreateNoWindow = true;

    if (_log.IsEnabled(LogLevel.Trace))
    {
        _log.LogTrace("exec: {FileName} {Args}", prc.StartInfo.FileName, prc.StartInfo.Arguments);
        _log.LogTrace("exec: as root = {AsRoot}", asRoot);
    }

    return prc;
}

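I have spent a lot of time (literally months) trying various changes to solve this: WaitForExitAsync has been overhauled many times, and I tried different ways of disposing. I tried calling GC.Collect() periodically. I also tried running the application in both Server and Workstation GC modes.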

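As I mentioned earlier, I am pretty sure it is this code that leaks: if I do not call ExecuteAndWaitAsync, there is no memory leak. The result class is not stored by the caller either; the caller simply parses a value and uses it immediately: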

public async Task<TimeSpan> GetSystemIdleTimeAsync(CancellationToken cancellationToken = default)
{
    ThrowIfNotLinux();

    const string prc = "xprintidle";
    TerminalResult result = await _terminal.ExecuteAndWaitAsync(prc, false, cancellationToken).ConfigureAwait(false);
    if (result.HasErrors || !int.TryParse(result.Output, out int idleMs))
        throw new InvalidOperationException($"{prc} returned invalid data.");
    return TimeSpan.FromMilliseconds(idleMs);
}

private static void ThrowIfNotLinux()
{
    if (!RuntimeInformation.IsOSPlatform(OSPlatform.Linux))
        throw new PlatformNotSupportedException($"{nameof(BacklightControl)} is only functional on Linux systems.");
}

Am I missing something? Is it the Process class that leaks, or the way I read the output?

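EDIT: As people asked in the comments, I created minimal runnable code, basically taking all the relevant methods into a single class and executing them in a loop. The code is available as a gist: https://gist.github.com/TehGM/c953b670ad8019b2b2be6af7b14807c2
I ran it on both my Windows machine and the Raspberry Pi. On Windows, memory looked stable, but on the Raspberry Pi it was clearly leaking. I tried both xprintidle and ifconfig to make sure this is not an issue with xprintidle only. I tried .NET Core 3.0 and .NET Core 3.1, and the effect was basically the same.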
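For anyone who wants to reproduce the comparison without external monitoring, here is a rough sketch of how memory growth could be sampled in a loop. It is not the code from the gist: MemoryProbe and SampleWorkingSetAsync are hypothetical names made up for illustration, and the working set is only an approximate indicator of a leak.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original code or the gist.
public static class MemoryProbe
{
    // Runs `action` `iterations` times and records the process working set
    // (in bytes) after every `sampleEvery` iterations.
    public static async Task<IReadOnlyList<long>> SampleWorkingSetAsync(
        Func<Task> action, int iterations, int sampleEvery)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));
        if (iterations < 0) throw new ArgumentOutOfRangeException(nameof(iterations));
        if (sampleEvery <= 0) throw new ArgumentOutOfRangeException(nameof(sampleEvery));

        var samples = new List<long>();
        for (int i = 1; i <= iterations; i++)
        {
            await action().ConfigureAwait(false);

            if (i % sampleEvery == 0)
            {
                // Collect first so that ordinary managed garbage is not
                // mistaken for a leak; what remains is closer to real growth.
                GC.Collect();
                GC.WaitForPendingFinalizers();
                GC.Collect();
                samples.Add(Environment.WorkingSet);
            }
        }
        return samples;
    }
}
```

Assuming `terminal` is an instance of the class that exposes ExecuteAndWaitAsync, a call such as `await MemoryProbe.SampleWorkingSetAsync(() => terminal.ExecuteAndWaitAsync("xprintidle", false), 10000, 1000)` would yield ten samples. Samples that keep climbing on Linux but stay flat on Windows would match what I observed.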

sro*_*oll 3

This is probably caused by a regression between .NET Core 2.2 and .NET Core 3.0, and apparently it will be fixed in the 3.1.7 release.

Simply starting a process causes a memory leak on Linux, because handles are not released.

The issue is tracked here: https://github.com/dotnet/runtime/issues/36661
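If you cannot upgrade right away, it may help to log a warning when the app runs on an affected runtime. The following is only a sketch under two assumptions: that the fix really ships in 3.1.7 as stated above, and that later major versions include it as well. RuntimeCheck and MayHaveProcessHandleLeak are hypothetical names, not an existing API.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class RuntimeCheck
{
    // Assumption: 3.1.7 is the first 3.x release containing the fix.
    public static readonly Version FirstFixedRelease = new Version(3, 1, 7);

    // Returns true for 3.0.x and for 3.1.0 through 3.1.6.
    // Assumption: runtimes with a major version other than 3 are not affected.
    public static bool MayHaveProcessHandleLeak(Version runtime)
    {
        if (runtime == null) throw new ArgumentNullException(nameof(runtime));
        return runtime.Major == 3 && runtime < FirstFixedRelease;
    }
}
```

On .NET Core 3.0 and later, Environment.Version reports the runtime version, so `RuntimeCheck.MayHaveProcessHandleLeak(Environment.Version)` could be checked once at startup.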