What does MaxDegreeOfParallelism = -1 mean in Parallel operations in .NET 6?

The*_*ias 2 c# task-parallel-library parallel.foreach .net-6.0 parallel.foreachasync

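The documentation of the ParallelOptions.MaxDegreeOfParallelism property states:

The MaxDegreeOfParallelism property affects the number of concurrent operations run by Parallel method calls that are passed this ParallelOptions instance. A positive property value limits the number of concurrent operations to the set value. If it is -1, there is no limit on the number of concurrently running operations.

By default, For and ForEach will utilize however many threads the underlying scheduler provides, so changing MaxDegreeOfParallelism from the default only limits how many concurrent tasks will be used.

I am trying to understand what "no limit" means in this context. Based on the above excerpt from the docs, my expectation was that a Parallel.Invoke operation configured with MaxDegreeOfParallelism = -1 would immediately start executing all the supplied actions in parallel. But this is not what happens. Here is an experiment with 12 actions: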

int concurrency = 0;
Action action = new Action(() =>
{
    var current = Interlocked.Increment(ref concurrency);
    Console.WriteLine(@$"Started an action at {DateTime
        .Now:HH:mm:ss.fff} on thread #{Thread
        .CurrentThread.ManagedThreadId} with concurrency {current}");
    Thread.Sleep(1000);
    Interlocked.Decrement(ref concurrency);
});
Action[] actions = Enumerable.Repeat(action, 12).ToArray();
var options = new ParallelOptions() { MaxDegreeOfParallelism = -1 };
Parallel.Invoke(options, actions);


Live demo.

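The result of this experiment does not match my expectations. Not all actions were invoked immediately. The maximum concurrency recorded is 6, sometimes 7, but never 12. So "no limit" does not mean what I think it means. My question is: what exactly does the MaxDegreeOfParallelism = -1 configuration mean for all four Parallel methods (For, ForEach, ForEachAsync and Invoke)? I want to know in detail how these methods behave when configured this way. In case there are behavioral differences between .NET versions, I am interested in the current .NET version (.NET 6), which also introduced a new API, Parallel.ForEachAsync.

Secondary question: is MaxDegreeOfParallelism = -1 exactly the same as omitting the optional parallelOptions argument in these methods?


Clarification: I am interested in the behavior of the Parallel methods when configured with the default TaskScheduler. I am not interested in any complications that might arise from using specialized or custom schedulers.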

tym*_*tam 5

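The definition is deliberately worded as "-1 means that the number of concurrent operations will not be artificially limited." It does not say that all actions will start immediately.

The thread pool manager normally keeps the number of available threads at the number of cores (or logical processors, typically 2x the number of cores with hyper-threading), which is considered the optimal number of threads (I think this number is [number of cores/logical processors + 1]). This means that when you start executing your actions, this is the number of threads available to start work immediately.

The thread pool manager runs periodically (twice a second), and if none of the threads have completed, it adds a new one (or, in the opposite case, removes one when there are too many threads).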
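To check these numbers on a given machine, the pool can be queried directly. This is a minimal sketch rather than part of the original experiment; it only reads the pool's settings through the standard ThreadPool API and makes no claim about the exact injection schedule.

```csharp
using System;
using System.Threading;

// Minimum: threads the pool creates on demand before it starts throttling injection.
ThreadPool.GetMinThreads(out int minWorker, out _);
// Maximum: hard upper bound on pool worker threads.
ThreadPool.GetMaxThreads(out int maxWorker, out _);

Console.WriteLine($"Logical processors: {Environment.ProcessorCount}");
Console.WriteLine($"Min worker threads: {minWorker}, max worker threads: {maxWorker}");
Console.WriteLine($"Threads currently in the pool: {ThreadPool.ThreadCount}");
```

Running this before and after a Parallel.Invoke batch should show the pool's thread count growing as threads are injected.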

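A good way to see this in action is to run your experiment twice in quick succession. In the first run, the number of concurrent jobs at the beginning should be around the number of cores/logical processors + 1, and in the second run it should equal the number of jobs (because the threads created to service the first run are still around):

Here is a modified version of your code: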

using System.Diagnostics;

Stopwatch sw = Stopwatch.StartNew();
int concurrency = 0;
Action action = new Action(() =>
{
    var current = Interlocked.Increment(ref concurrency);
    Console.WriteLine(@$"Started at {sw.ElapsedMilliseconds} with concurrency {current}");
    Thread.Sleep(10_000);
    current = Interlocked.Decrement(ref concurrency);
});


Action[] actions = Enumerable.Repeat(action, 12).ToArray();
var options = new ParallelOptions() { MaxDegreeOfParallelism = -1 };
Parallel.Invoke(options, actions);

Parallel.Invoke(options, actions);

Output:

Started at 114 with concurrency 8
Started at 114 with concurrency 1
Started at 114 with concurrency 2
Started at 114 with concurrency 3
Started at 114 with concurrency 4
Started at 114 with concurrency 6
Started at 114 with concurrency 5
Started at 114 with concurrency 7
Started at 114 with concurrency 9
Started at 1100 with concurrency 10
Started at 2097 with concurrency 11
Started at 3100 with concurrency 12
Started at 13110 with concurrency 1
Started at 13110 with concurrency 2
Started at 13110 with concurrency 3
Started at 13110 with concurrency 5
Started at 13110 with concurrency 7
Started at 13110 with concurrency 9
Started at 13110 with concurrency 10
Started at 13110 with concurrency 11
Started at 13110 with concurrency 4
Started at 13110 with concurrency 12
Started at 13110 with concurrency 6
Started at 13110 with concurrency 8

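My computer has 4 cores (8 logical processors). When the jobs run on a "cold" TaskScheduler.Default, 8+1 of them start immediately, and after that a new thread is added periodically.

Then, when the second batch runs "hot", all jobs start at the same time.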
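If the slow ramp-up of the cold run is a problem, one option that follows from this explanation is to raise the pool's minimum worker thread count before starting the batch. The sketch below is my own illustration, not something the experiment above measured: the constant actionCount and the peak tracking are hypothetical additions, and the peak it reports will depend on the machine and runtime.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

const int actionCount = 12; // hypothetical batch size, matching the experiment

// Assumption: letting the pool create at least actionCount workers on demand
// avoids waiting for the periodic thread injection on a cold pool.
ThreadPool.GetMinThreads(out int minWorker, out int minIo);
bool raised = ThreadPool.SetMinThreads(Math.Max(minWorker, actionCount), minIo);

Stopwatch sw = Stopwatch.StartNew();
int concurrency = 0, peak = 0, started = 0;
Action action = () =>
{
    int current = Interlocked.Increment(ref concurrency);
    Interlocked.Increment(ref started);

    // Lock-free "max": keep the highest concurrency observed so far.
    int seen;
    while (current > (seen = Volatile.Read(ref peak)) &&
           Interlocked.CompareExchange(ref peak, current, seen) != seen) { }

    Console.WriteLine($"Started at {sw.ElapsedMilliseconds} with concurrency {current}");
    Thread.Sleep(1000);
    Interlocked.Decrement(ref concurrency);
};

var options = new ParallelOptions { MaxDegreeOfParallelism = -1 };
Parallel.Invoke(options, Enumerable.Repeat(action, actionCount).ToArray());

Console.WriteLine($"SetMinThreads succeeded: {raised}, peak concurrency: {peak}");
```

Raising the minimum trades memory and context switching for faster start-up, so it is best reserved for workloads that really do block threads.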

Parallel.ForEachAsync

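When a similar example is run with Parallel.ForEachAsync, the behavior is different. The work is done at a constant level of parallelism. Note that this is not about threads: if you await Task.Delay (and therefore do not block the thread), the number of parallel jobs stays the same.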
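A sketch of such an experiment, under the assumption that it mirrors the Parallel.Invoke test but with a non-blocking delay. The item count and the peak tracking are hypothetical choices of mine; the body awaits Task.Delay so that no thread is held while an item is "in progress".

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

int total = Environment.ProcessorCount * 3; // hypothetical workload size
int concurrency = 0, peak = 0, processed = 0;
var options = new ParallelOptions { MaxDegreeOfParallelism = -1 };

await Parallel.ForEachAsync(Enumerable.Range(0, total), options, async (item, ct) =>
{
    int current = Interlocked.Increment(ref concurrency);

    // Lock-free "max": keep the highest concurrency observed so far.
    int seen;
    while (current > (seen = Volatile.Read(ref peak)) &&
           Interlocked.CompareExchange(ref peak, current, seen) != seen) { }

    await Task.Delay(200, ct); // asynchronous wait, the thread is released meanwhile

    Interlocked.Decrement(ref concurrency);
    Interlocked.Increment(ref processed);
});

Console.WriteLine($"Logical processors: {Environment.ProcessorCount}, peak concurrency: {peak}");
```

On .NET 6 the reported peak should not exceed the number of logical processors, even though no thread is blocked during the delay.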

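If we peek at the source code of the overload that takes a ParallelOptions, we can see that it passes parallelOptions.EffectiveMaxConcurrencyLevel as dop to the private method that does the real work.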

public static Task ForEachAsync<TSource>(IEnumerable<TSource> source!!, ParallelOptions parallelOptions!!, Func<TSource, CancellationToken, ValueTask> body!!)
{
     return ForEachAsync(source, parallelOptions.EffectiveMaxConcurrencyLevel, ...);
}

If we peek further, we can see that:

  • dop is documented as "A integer indicating how many operations to allow to run in parallel."
  • when dop is negative, the actual degree of parallelism is DefaultDegreeOfParallelism

/// <param name="dop">A integer indicating how many operations to allow to run in parallel.</param>
(...)
private static Task ForEachAsync<TSource>(IEnumerable<TSource> source, int dop,
{
    ...

    if (dop < 0)
    {
        dop = DefaultDegreeOfParallelism;
    }

Peeking one last time, we can see that the final value is Environment.ProcessorCount:

private static int DefaultDegreeOfParallelism => Environment.ProcessorCount;

This is how it is now; I am not sure whether it will stay this way in .NET 7.
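One practical consequence, which I am inferring from the source excerpts above rather than quoting from them: for asynchronous, non-blocking work where more concurrency than Environment.ProcessorCount is wanted, the limit has to be set explicitly instead of relying on -1. A minimal sketch, with a hypothetical limit of 12:

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

const int total = 12; // hypothetical: number of items, also used as the explicit limit
int concurrency = 0, peak = 0, processed = 0;

// Explicit positive value: ForEachAsync uses it as given instead of the default.
var options = new ParallelOptions { MaxDegreeOfParallelism = total };

await Parallel.ForEachAsync(Enumerable.Range(0, total), options, async (item, ct) =>
{
    int current = Interlocked.Increment(ref concurrency);

    // Lock-free "max": keep the highest concurrency observed so far.
    int seen;
    while (current > (seen = Volatile.Read(ref peak)) &&
           Interlocked.CompareExchange(ref peak, current, seen) != seen) { }

    await Task.Delay(500, ct); // stands in for non-blocking I/O

    Interlocked.Decrement(ref concurrency);
    Interlocked.Increment(ref processed);
});

Console.WriteLine($"Requested limit: {total}, peak concurrency: {peak}");
```

Because the body does not block, the peak can approach the requested limit without the thread pool having to grow.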