Thread.Join does not always return the same value when the thread has already terminated (.NET 5 / Core)

Rem*_*din 10 c# multithreading .net-core

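The Thread.Join method has three overloads: Join(), Join(Int32), and Join(TimeSpan). For each of these three overloads, the Microsoft documentation contains the following statement:

If the thread has already terminated when Join is called, the method returns immediately.

While this statement makes sense for the Join() overload, it does not specify which value Join(Int32) and Join(TimeSpan) return, so I tested the Int32 overload in two different environments: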

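  1. Windows 10: returns true
  2. Linux/Docker: returns false (using mcr.microsoft.com/dotnet/runtime:5.0 on Docker Desktop)

Note that the Linux/Docker implementation does return true (like the Windows implementation) if the thread is still running when Join is called and terminates during the call. It returns false only if the thread terminated before Join was called.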
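To make the case being tested concrete, here is a minimal sketch of calling Join(Int32) on a thread that has already terminated. The class name JoinDemo and the helper JoinAfterTermination are hypothetical names used only for illustration. On an ordinary code path the documentation suggests the result should be true; it does not reproduce the Docker-specific behavior described here.

```csharp
using System;
using System.Threading;

public static class JoinDemo
{
    // Hypothetical helper (not part of the original test): starts a worker,
    // waits until it has terminated, then calls Join(Int32) on the dead thread.
    public static bool JoinAfterTermination(int timeoutMs)
    {
        var worker = new Thread(() => { /* returns immediately */ });
        worker.Start();

        // Block without a timeout so the thread is guaranteed to have
        // terminated before the timed Join below is issued.
        worker.Join();

        // The call under test: Join(Int32) on an already-terminated thread.
        return worker.Join(timeoutMs);
    }

    public static void Main()
    {
        Console.WriteLine("Join(100) returned: {0}", JoinAfterTermination(100));
    }
}
```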

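In my opinion, Join should always return true regardless of platform, so what could explain this inconsistent behavior? Am I missing something, or is this a .NET 5 bug?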

Update

As suggested by @txtechhelp, here is a .NET Fiddle containing the exact code I am testing.

If I run this code on Windows 10 (or on .NET Fiddle), I get the following result:

Starting..
Sleeping 1200..expect T1 end before join
In T1
Leaving T1
Join(100)..expect success
Join(100) success!
Done..

Then, if I run this code on Docker Desktop (v. 3.1.0) using mcr.microsoft.com/dotnet/runtime:5.0, I get the following result:

Starting..
Sleeping 1200..expect T1 end before join
In T1
Leaving T1
Join(100)..expect success
Join(100) failed
Done..

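Update 2

Actually, after further testing, I realized that the above test fails only if Join is called while the Docker application is unloading (i.e. after the AssemblyLoadContext.Default.Unloading event has been received, which is the signal sent by Docker to notify the application that it is going to be shut down).

So here is the exact test, which fails even on .NET Fiddle: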

public class Program
{
    public static void Main()
    {
        System.Runtime.Loader.AssemblyLoadContext.Default.Unloading += (arg) => { OnStopSignalReceived("application unloading"); };
    }

    public static void T1()
    {
        System.Console.WriteLine("In T1");
        System.Threading.Thread.Sleep(1000);
        System.Console.WriteLine("Leaving T1");
    }

    private static void OnStopSignalReceived(string stopSignalSource)
    {
        System.Threading.Thread t1 = new System.Threading.Thread(T1);
        System.Console.WriteLine("Starting..");
        t1.Start();
        System.Console.WriteLine("Sleeping 1200..expect T1 end before join");
        System.Threading.Thread.Sleep(1200);
        System.Console.WriteLine("Join(100)..expect success");
        if (t1.Join(100))
        {
            System.Console.WriteLine("Join(100) success!");
        }
        else
        {
            System.Console.WriteLine("Join(100) failed");
        }
        t1.Join();
        System.Console.WriteLine("Done..");
    }
}

txt*_*elp 1

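This does indeed appear to be a bug in the underlying .NET 5 code of the AssemblyLoadContext class, or possibly some undefined behavior that the documentation does not yet specify; that is, the documentation for the AssemblyLoadContext.Unloading event states only:

Occurs when the AssemblyLoadContext is unloaded.

That is the only sentence, and it does not give much context for the problem you are running into.

That being said, after some digging I wrote two versions of the code you provided and found some interesting behavior in how AssemblyLoadContext.Unloading interacts with threads.

This code reproduces the bug you mentioned: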

buggy.cs

using System;
using System.Diagnostics;
using System.Threading;
using System.Runtime.Loader;

public class Program
{
        static Thread t1 = new Thread(ThreadFn);
        static Stopwatch sw = new Stopwatch();
    
        public static void Main()
        {
            AssemblyLoadContext.Default.Unloading += ContextUnloading;
            sw.Start();
            Console.WriteLine("{0}ms: leaving main", sw.ElapsedMilliseconds);
        }
    
        public static void ThreadFn()
        {
            Console.WriteLine("{0}ms: in ThreadFn, sleeping 1s", sw.ElapsedMilliseconds);
            Thread.Sleep(1000);
            Console.WriteLine("{0}ms: leaving ThreadFn", sw.ElapsedMilliseconds);
        }

        private static void ContextUnloading(AssemblyLoadContext context)
        {
            Console.WriteLine("{0}ms: unloading '{1}', thread state '{2}'", sw.ElapsedMilliseconds, context, t1.ThreadState);
            
            // possible bug/UB with t1.Start() in this function
            Console.WriteLine("{0}ms: starting thread", sw.ElapsedMilliseconds);
            t1.Start();

            Console.WriteLine("{0}ms: calling Sleep(1200); expect thread in state '{1}' to end before join called", sw.ElapsedMilliseconds, t1.ThreadState);
            Thread.Sleep(1200);
            Console.WriteLine("{0}ms: calling Join(100) on thread in state '{1}'; expect 'succeeded!'", sw.ElapsedMilliseconds, t1.ThreadState);
            Console.WriteLine("Join(100) {0}", (t1.Join(100) ? "succeeded!" : "failed"));
            Console.WriteLine("{0}ms: done", sw.ElapsedMilliseconds);
        }
}

Run it on dotnetfiddle

Running that code gives the following result:

0ms: leaving main
12ms: unloading '"Default" System.Runtime.Loader.DefaultAssemblyLoadContext #0', thread state 'Unstarted'
13ms: starting thread
14ms: calling Sleep(1200); expect thread in state 'Running' to end before join called
14ms: in ThreadFn, sleeping 1s
1014ms: leaving ThreadFn
1214ms: calling Join(100) on thread in state 'Stopped'; expect 'succeeded!'
Join(100) failed
1214ms: done

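You'll notice that in this buggy version, the thread state and the timing at each point match the code you provided. The bug occurs when the Join(Int32) method is called, even though the documentation states that the return value is a Boolean whose value is:

true if the thread has terminated; false if the thread has not terminated after the amount of time specified by the millisecondsTimeout parameter has elapsed.

The thread was in the Stopped state, which according to the ThreadState documentation means either that the thread is responding to an Abort call (which is never made in the code above) or that

A thread is terminated.

And if you read the documentation on Understanding System.Runtime.Loader.AssemblyLoadContext, it even notes:

Beware thread races. Loading can be triggered by multiple threads. The AssemblyLoadContext handles thread races by atomically adding assemblies to its cache. The race loser's instance is discarded. In your implementation logic, don't add extra logic that doesn't handle multiple threads properly.

Putting all of this together, one would assume that the Join(Int32) call in the code above should give the expected result of true.

So yes, this does appear to be a bug.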

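However

If you move the thread start into Main instead of the Unloading event handler, the AssemblyLoadContext.Unloading event for the Default context is not raised until the thread has finished, and the Join(Int32) call then of course returns the expected result.

It makes sense that the Unloading event is not raised until the thread has finished, since the thread can be considered "part of" the current assembly context, but that does not explain why the bug in the code above still occurs.

So while the Join(100) call does succeed as expected in the code below, that seems to be because AssemblyLoadContext.Unloading is not raised right after Main exits, as one might expect, but only after the thread has finished. That makes sense in context, but it is not necessarily stated in any documentation.

"Successful" code: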

sync_bug.cs

using System;
using System.Diagnostics;
using System.Threading;
using System.Runtime.Loader;

public class Program
{
        static Thread t1 = new Thread(ThreadFn);
        static Stopwatch sw = new Stopwatch();
    
        public static void Main()
        {
            AssemblyLoadContext.Default.Unloading += ContextUnloading;
            sw.Start();

            // Get expected result starting thread, but Unloading isn't called until AFTER the thread
            // finishes, which is not the expected result according to the .NET documentation
            Console.WriteLine("{0}ms: starting thread", sw.ElapsedMilliseconds);
            t1.Start();
            
            Console.WriteLine("{0}ms: leaving main", sw.ElapsedMilliseconds);
        }
    
        public static void ThreadFn()
        {
            Console.WriteLine("{0}ms: in ThreadFn, sleeping 1s", sw.ElapsedMilliseconds);
            Thread.Sleep(1000);
            Console.WriteLine("{0}ms: leaving ThreadFn", sw.ElapsedMilliseconds);
        }

        private static void ContextUnloading(AssemblyLoadContext context)
        {
            Console.WriteLine("{0}ms: unloading '{1}', thread state '{2}'", sw.ElapsedMilliseconds, context, t1.ThreadState);
            Console.WriteLine("{0}ms: calling Sleep(1200); expect thread in state '{1}' to end before join called", sw.ElapsedMilliseconds, t1.ThreadState);
            Thread.Sleep(1200);
            Console.WriteLine("{0}ms: calling Join(100) on thread in state '{1}'; expect 'succeeded!'", sw.ElapsedMilliseconds, t1.ThreadState);
            Console.WriteLine("Join(100) {0}", (t1.Join(100) ? "succeeded!" : "failed"));
            Console.WriteLine("{0}ms: done", sw.ElapsedMilliseconds);
        }
}

Run it on dotnetfiddle

Running that code gives the following result:

0ms: starting thread
11ms: leaving main
12ms: in ThreadFn, sleeping 1s
1012ms: leaving ThreadFn
1013ms: unloading '"Default" System.Runtime.Loader.DefaultAssemblyLoadContext #0', thread state 'Stopped'
1014ms: calling Sleep(1200); expect thread in state 'Stopped' to end before join called
2214ms: calling Join(100) on thread in state 'Stopped'; expect 'succeeded!'
Join(100) succeeded!
2215ms: done

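You'll notice that the Unloading event is not raised until the thread has finished.

At the time of writing this answer, the AssemblyLoadContext class has 82 open issues and 346 closed ones. So it's possible your problem has already been noted in some way, but a cursory search did not turn up anything that looked related to it.

Since this does seem to be a legitimate bug, and since you have more insight into your code and what is happening, I would recommend going to their issues page and filing a new issue.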
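In the meantime, since the output above shows ThreadState correctly reporting Stopped even when Join(100) returns false, one possible stopgap is to check the state before trusting the timed Join. This is only a sketch under that assumption: the SafeJoin class and JoinOrStopped helper are hypothetical names, not part of any .NET API, and I have not verified this inside an Unloading handler on Docker.

```csharp
using System;
using System.Threading;

public static class SafeJoin
{
    // Hypothetical helper: treats a thread whose state already includes
    // ThreadState.Stopped as joined, and only otherwise falls back to the
    // timed Join. Assumes ThreadState is reliable where Join(Int32) is not.
    public static bool JoinOrStopped(Thread thread, int millisecondsTimeout)
    {
        // ThreadState is a flags enum, so test the Stopped bit explicitly.
        if ((thread.ThreadState & ThreadState.Stopped) != 0)
        {
            return true;
        }

        return thread.Join(millisecondsTimeout);
    }

    public static void Main()
    {
        var worker = new Thread(() => Thread.Sleep(50));
        worker.Start();
        worker.Join(); // make sure the worker has terminated first

        Console.WriteLine("JoinOrStopped(100) returned: {0}", JoinOrStopped(worker, 100));
    }
}
```

Note that, like Thread.Join itself, this helper will still throw a ThreadStateException if the thread has never been started.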