StackOverflowException using Lazy<T> when throwing an exception

Rau*_*otz 9 c#

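A very simple sample application (.NET 4.6.2) produces a StackOverflowException at a recursion depth of 12737, or at a depth of 10243 if the innermost function call throws an exception. That is expected and fine.

If I use a Lazy<T> to briefly hold an intermediate result, the StackOverflowException already occurs at a recursion depth of 2207 if no exception is thrown, and at a recursion depth of 105 if an exception is thrown.

Note: The StackOverflowException at a depth of 105 is only observable when compiled for x64. With x86 (32-bit), the effect first occurs at a depth of 4272. Mono (as used at https://repl.it) works without problems up to a depth of 74200.

The StackOverflowException does not occur during the deep recursion itself, but while ascending back to the main routine. The finally blocks are processed up to some depth, then the program dies: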

Exception System.InvalidOperationException at 105
Finally at 105
...
Exception System.InvalidOperationException at 55
Finally at 55
Exception System.InvalidOperationException at 54
Finally at 54
Process is terminated due to StackOverflowException.

Or in the debugger:

The program '[xxxxx] Test.vshost.exe' has exited with code -2147023895 (0x800703e9).

Can anybody explain this?

using System;

public class Program
{
    private class Test
    {
        private int maxDepth;

        private int CalculateWithLazy(int depth)
        {
            try
            {
                var lazy = new Lazy<int>(() => this.Calculate(depth));
                return lazy.Value;
            }  
            catch (Exception e)
            {
                Console.WriteLine("Exception " + e.GetType() + " at " + depth);
                throw;
            }
            finally
            {
                Console.WriteLine("Finally at " + depth);
            }
        }

        private int Calculate(int depth)
        {
            if (depth >= this.maxDepth) throw new InvalidOperationException("Max. recursion depth reached.");
            return this.CalculateWithLazy(depth + 1);
        }

        public void Run()
        {
            for (int i = 1; i < 100000; i++)
            {
                this.maxDepth = i;

                try
                {
                    Console.WriteLine("MaxDepth: " + i);
                    this.CalculateWithLazy(0);

                }
                catch { /* ignore */ }
            }
        }
    }

    public static void Main(string[] args)
    {
        var test = new Test();
        test.Run();
        Console.Read();
    }
}

Update: The problem can be reproduced without Lazy<T>, just by having a try-catch-throw block in the recursive method.

        [MethodImpl(MethodImplOptions.NoInlining)]
        private int Calculate(int depth)
        {
            try
            {
                if (depth >= this.maxDepth) throw new InvalidOperationException("Max. recursion depth reached.");
                return this.Calculate2(depth + 1);
            }
            catch
            {
                throw;
            }
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        private int Calculate2(int depth) // just to prevent the compiler from tail-recursion-optimization
        {
            return this.Calculate(depth);
        }

        public void Run()
        {
            for (int i = 1; i < 100000; i++)
            {
                this.maxDepth = i;

                try
                {
                    Console.WriteLine("MaxDepth: " + i);
                    this.Calculate(0);

                }
                catch(Exception e)
                {
                    Console.WriteLine("Finished with " + e.GetType());
                }
            }
        }

Luc*_*ski 8

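The problem can be reproduced without Lazy<T>, just by having a try-catch-throw block in the recursive method.

You already noticed the source of the behavior. Now let me explain why, since this doesn't make sense, right?

It doesn't make sense because the exception is caught and then immediately rethrown, so the stack should shrink, right?

The following test: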

using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

internal class Program
{
    private int _maxDepth;

    [MethodImpl(MethodImplOptions.NoInlining)]
    private int Calculate(int depth)
    {
        try
        {
            Console.WriteLine("In try at depth {0}: stack frame count = {1}", depth, new StackTrace().FrameCount);

            if (depth >= _maxDepth)
                throw new InvalidOperationException("Max. recursion depth reached.");

            return Calculate2(depth + 1);
        }
        catch
        {
            Console.WriteLine("In catch at depth {0}: stack frame count = {1}", depth, new StackTrace().FrameCount);
            throw;
        }
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    private int Calculate2(int depth) => Calculate(depth);

    public void Run()
    {
        try
        {
            _maxDepth = 10;
            Calculate(0);
        }
        catch (Exception e)
        {
            Console.WriteLine("Finished with " + e.GetType());
        }
    }

    public static void Main() => new Program().Run();
}

seems to validate that hypothesis:

In try at depth 0: stack frame count = 3
In try at depth 1: stack frame count = 5
In try at depth 2: stack frame count = 7
In try at depth 3: stack frame count = 9
In try at depth 4: stack frame count = 11
In try at depth 5: stack frame count = 13
In try at depth 6: stack frame count = 15
In try at depth 7: stack frame count = 17
In try at depth 8: stack frame count = 19
In try at depth 9: stack frame count = 21
In try at depth 10: stack frame count = 23
In catch at depth 10: stack frame count = 23
In catch at depth 9: stack frame count = 21
In catch at depth 8: stack frame count = 19
In catch at depth 7: stack frame count = 17
In catch at depth 6: stack frame count = 15
In catch at depth 5: stack frame count = 13
In catch at depth 4: stack frame count = 11
In catch at depth 3: stack frame count = 9
In catch at depth 2: stack frame count = 7
In catch at depth 1: stack frame count = 5
In catch at depth 0: stack frame count = 3
Finished with System.InvalidOperationException

Well... no. It's not that simple.


.NET exceptions are built on top of Windows Structured Exception Handling (SEH), which is a complicated beast.

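There are two articles you need to read if you want to know the details. They are old, but the parts relevant to your question are still accurate: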

  • "The Exception Model" by Chris Brumme
  • "A Crash Course on the Depths of Win32 Structured Exception Handling" by Matt Pietrek

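But let's focus on the question at hand, which is stack unwinding.

Here's what the first article says about what happens when the stack is unwound (emphasis mine):

Another form of unwind is the actual popping of the CPU stack. This doesn't happen as eagerly as the popping of the SEH records. On X86, EBP is used as the frame pointer for methods containing SEH. ESP points to the top of the stack, as always. Until the stack is actually unwound, all the handlers are executed on top of the faulting exception frame. So the stack actually grows when a handler is called for the first or second pass. EBP is set to the frame of the method containing a filter or finally clause so that local variables of that method will be in scope.

The actual popping of the stack doesn't occur until the catching 'except' clause is executed.

Let's modify our previous test program to check this: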

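// Compile with /unsafe: GetSomePointerNearTheTopOfTheStack takes the address of a local.
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

internal class Program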
{
    private int _maxDepth;
    private UIntPtr _stackStart;

    [MethodImpl(MethodImplOptions.NoInlining)]
    private int Calculate(int depth)
    {
        try
        {
            Console.WriteLine("In try at depth {0}: stack frame count = {1}, stack offset = {2}", depth, new StackTrace().FrameCount, GetLooseStackOffset());

            if (depth >= _maxDepth)
                throw new InvalidOperationException("Max. recursion depth reached.");

            return Calculate2(depth + 1);
        }
        catch
        {
            Console.WriteLine("In catch at depth {0}: stack frame count = {1}, stack offset = {2}", depth, new StackTrace().FrameCount, GetLooseStackOffset());
            throw;
        }
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    private int Calculate2(int depth) => Calculate(depth);

    public void Run()
    {
        try
        {
            _stackStart = GetSomePointerNearTheTopOfTheStack();
            _maxDepth = 10;
            Calculate(0);
        }
        catch (Exception e)
        {
            Console.WriteLine("Finished with " + e.GetType());
        }
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    private static unsafe UIntPtr GetSomePointerNearTheTopOfTheStack()
    {
        int dummy;
        return new UIntPtr(&dummy);
    }

    private int GetLooseStackOffset() => (int)((ulong)_stackStart - (ulong)GetSomePointerNearTheTopOfTheStack());

    public static void Main() => new Program().Run();
}

Here's the result:

In try at depth 0: stack frame count = 3, stack offset = 384
In try at depth 1: stack frame count = 5, stack offset = 752
In try at depth 2: stack frame count = 7, stack offset = 1120
In try at depth 3: stack frame count = 9, stack offset = 1488
In try at depth 4: stack frame count = 11, stack offset = 1856
In try at depth 5: stack frame count = 13, stack offset = 2224
In try at depth 6: stack frame count = 15, stack offset = 2592
In try at depth 7: stack frame count = 17, stack offset = 2960
In try at depth 8: stack frame count = 19, stack offset = 3328
In try at depth 9: stack frame count = 21, stack offset = 3696
In try at depth 10: stack frame count = 23, stack offset = 4064
In catch at depth 10: stack frame count = 23, stack offset = 13024
In catch at depth 9: stack frame count = 21, stack offset = 21888
In catch at depth 8: stack frame count = 19, stack offset = 30752
In catch at depth 7: stack frame count = 17, stack offset = 39616
In catch at depth 6: stack frame count = 15, stack offset = 48480
In catch at depth 5: stack frame count = 13, stack offset = 57344
In catch at depth 4: stack frame count = 11, stack offset = 66208
In catch at depth 3: stack frame count = 9, stack offset = 75072
In catch at depth 2: stack frame count = 7, stack offset = 83936
In catch at depth 1: stack frame count = 5, stack offset = 92800
In catch at depth 0: stack frame count = 3, stack offset = 101664
Finished with System.InvalidOperationException

Oops. Yes, the stack actually grows while we're handling the exception.

With _maxDepth = 1000, execution ends at:

In catch at depth 933: stack frame count = 1869, stack offset = 971232
In catch at depth 932: stack frame count = 1867, stack offset = 980096
In catch at depth 931: stack frame count = 1865, stack offset = 988960
In catch at depth 930: stack frame count = 1863, stack offset = 997824

Process is terminated due to StackOverflowException.

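So our own code used about 997824 bytes of stack space, which is pretty close to the 1 MB default thread stack size on Windows. The CLR code that calls into ours should account for the difference.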
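
The numbers in the two outputs above fit a simple linear model, which makes it easier to see why the process dies where it does. The following is only a back-of-the-envelope sketch: the constants are read off the x64 output shown above and will differ on other machines and runtimes, and the StackModel class and its method names are made up for illustration.

```csharp
using System;

// Hypothetical helper: a linear model fitted to the x64 measurements above.
internal static class StackModel
{
    private const int BaseOffset = 384;          // offset printed in the try block at depth 0
    private const int BytesPerTryLevel = 368;    // 752 - 384: one Calculate + Calculate2 pair
    private const int BytesForFirstCatch = 8960; // 13024 - 4064: entering the innermost catch
    private const int BytesPerRethrow = 8864;    // 21888 - 13024: each additional rethrow

    // Predicted stack offset inside the try block at the given depth.
    public static int OffsetInTry(int depth) => BaseOffset + BytesPerTryLevel * depth;

    // Predicted stack offset inside the catch block at the given depth,
    // after the exception was thrown at maxDepth.
    public static int OffsetInCatch(int depth, int maxDepth) =>
        OffsetInTry(maxDepth) + BytesForFirstCatch + BytesPerRethrow * (maxDepth - depth);

    public static void Main()
    {
        Console.WriteLine(OffsetInCatch(0, 10));     // 101664, as measured above
        Console.WriteLine(OffsetInCatch(930, 1000)); // 997824, the last line before the crash
    }
}
```

With these constants, a rethrown level costs about 24 times as much stack as a level of plain recursion (8864 vs. 368 bytes), so roughly 70 rethrows are enough to use up what is left of the 1 MB stack after 1000 levels of recursion.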


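As you may know, SEH exceptions are handled in two passes:

  • The first pass (filtering) finds the first exception handler able to handle the exception. In C#, this basically checks whether the catch clause matches the exception type, and executes the when part of catch (...) when (...) if it is present.
  • The second pass (unwinding) actually handles the exception.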
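
The two passes can be observed from C# itself, because an exception filter runs during the first pass while finally blocks run during the second. The sketch below is illustrative only (the TwoPassDemo class and its members are invented for this example); it records the order of events in a list:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class, not part of the original test program.
internal static class TwoPassDemo
{
    // Records the order in which the pieces of exception handling run.
    public static readonly List<string> Log = new List<string>();

    private static bool Filter()
    {
        Log.Add("filter"); // runs during the first pass
        return true;       // "yes, this catch clause handles the exception"
    }

    private static void Inner()
    {
        try
        {
            throw new InvalidOperationException("boom");
        }
        finally
        {
            Log.Add("inner finally"); // runs during the second pass
        }
    }

    public static void Run()
    {
        Log.Clear();
        try
        {
            Inner();
        }
        catch (InvalidOperationException) when (Filter())
        {
            Log.Add("catch"); // runs once unwinding has reached this frame
        }
    }

    public static void Main()
    {
        Run();
        Console.WriteLine(string.Join(" -> ", Log));
        // prints: filter -> inner finally -> catch
    }
}
```

The filter is evaluated while the frame of Inner is still intact, that is, before its finally block has run. That is the same reason the stack cannot be popped during the first pass.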

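Here's what the second article says about the unwinding process:

When an exception occurs, the system walks the list of EXCEPTION_REGISTRATION structures until it finds a handler for the exception. Once a handler is found, the system walks the list again, up to the node that will handle the exception. During this second traversal, the system calls each handler function a second time. The key distinction is that in the second call, the value 2 is set in the exception flags. This value corresponds to EH_UNWINDING.

[...]

After an exception is handled and all the previous exception frames have been called to unwind, execution continues wherever the handling callback decides.

This only confirms what the first article says.

The first pass needs to preserve the faulting stack in order to be able to inspect its state, and to be able to resume execution at the faulting instruction (yes, that's a thing: it's very low-level, but you can patch the cause of the error and resume execution as if there had been no error in the first place).

The second pass is implemented just like the first one, except that the handlers now get the EH_UNWINDING flag. This means the faulting stack is still preserved at that point, until the final handler decides where to resume execution.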


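For a 32-bit process the stack pointer moves by 36 bytes, but by a whopping 8976 bytes for a 64-bit process while unwinding the stack. What's the explanation for this?

Good question!

That's because 32-bit and 64-bit SEH are totally different. Here's some reading material (emphasis mine):

Because on the x86 each function that uses SEH has this aforementioned construct as part of its prolog, the x86 is said to use frame-based exception handling. There are a couple of problems with this approach:

  • Because the exception information is stored on the stack, it is susceptible to buffer overflow attacks.
  • Overhead. Exceptions are, well, exceptional, which means the exception will not occur in the common case. Regardless, every time a function that uses SEH is entered, these extra instructions are executed.

Because the x64 was a chance to do away with a lot of the cruft that had been hanging around for decades, SEH got an overhaul that addressed both issues mentioned above. On the x64, SEH has become table-based, which means that when the source code is compiled, a table is created that fully describes all the exception handling code within the module. This table is then stored as part of the PE header. If an exception occurs, the exception table is parsed by Windows to find the appropriate exception handler to execute. Because exception handling information is tucked safely away in the PE header, it is no longer susceptible to buffer overflow attacks. In addition, because the exception table is generated as part of the compilation process, no runtime overhead (in the form of push and pop instructions) is incurred during normal processing.

Of course, table-based exception handling schemes have a few negative aspects of their own. For example, table-based schemes tend to take more space in memory than stack-based schemes. Also, while the overhead in the normal execution path is reduced, the overhead it takes to process an exception is significantly higher than in frame-based approaches. Like everything in life, there are trade-offs to consider when evaluating whether the table-based or frame-based approach to exception handling is "best".

In short, the happy path has been optimized in x64, while the exceptional path has become slower. If you ask me, that's a good thing.

Here's another quote from the first article I linked earlier:

Both IA64 and AMD64 have a model for exception handling that avoids reliance on an explicit handler chain that starts in TLS and is threaded through the stack. Instead, exception handling relies on the fact that on 64-bit systems we can perfectly unwind a stack. And this ability is itself due to the fact that these chips are severely constrained on the calling conventions they support.

[...]

Anyway, on 64-bit systems the correspondence between an activation record on the stack and the exception record that applies to it is not achieved through an FS:[0] chain. Instead, unwinding of the stack reveals the code addresses that correspond to a particular activation record. These instruction pointers of the method are looked up in a table to find out whether there are any __try/__except/__finally clauses that cover these code addresses. This table also indicates how to proceed with the unwind by describing the actions of the method epilog.

So yeah. Totally different approach.

But let's take a look at the x64 call stack to see where the stack space is actually used. I modified Calculate as follows: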

private int Calculate(int depth)
{
    try
    {
        if (depth >= _maxDepth)
            throw new InvalidOperationException("Max. recursion depth reached.");

        return Calculate2(depth + 1);
    }
    catch
    {
        if (depth == _maxDepth)
        {
            Console.ReadLine();
        }

        throw;
    }
}

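I put a breakpoint on Console.ReadLine(); and took a look at the native call stack, along with the value of the stack pointer register on each frame:

[Image: native call stack]

The addresses fffffffffffffffe and 0000000000008000 look very weird to me, but anyway this shows how much stack space each frame consumes. Lots of stuff is going on in the Windows Native API (ntdll.dll), and the CLR adds some more.

We're out of luck as far as the internal Windows stuff is concerned, since the source code is not publicly available. But we can at least take a look at clr.dll!ClrUnwindEx, since that function is quite small yet uses quite a lot of stack space: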

void ClrUnwindEx(EXCEPTION_RECORD* pExceptionRecord, UINT_PTR ReturnValue, UINT_PTR TargetIP, UINT_PTR TargetFrameSp)
{
    PVOID TargetFrame = (PVOID)TargetFrameSp;

    CONTEXT ctx;
    RtlUnwindEx(TargetFrame,
                (PVOID)TargetIP,
                pExceptionRecord,
                (PVOID)ReturnValue, // ReturnValue
                &ctx,
                NULL);      // HistoryTable

    // doesn't return
    UNREACHABLE();
}

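It defines a CONTEXT variable on the stack, which is a large struct. I can only suppose that the 64-bit SEH functions use their stack space for similar purposes.

Now let's compare this to the 32-bit call stack:

[Image: 32-bit call stack]

As you can see, it's not at all the same as in 64-bit.

Out of curiosity, I tested the behavior of a simple C++ program: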

#include "stdafx.h"
#include <iostream>

__declspec(noinline) static char* GetSomePointerNearTheTopOfTheStack()
{
    char dummy;
    return &dummy;
}

int main()
{
    auto start = GetSomePointerNearTheTopOfTheStack();

    try
    {
        throw 42;
    }
    catch (...)
    {
        auto here = GetSomePointerNearTheTopOfTheStack();
        std::cout << "Difference in " << (sizeof(char*) * 8) << "-bit: " << (start - here) << std::endl;
    }

    return 0;
}

Here are the results:

Difference in 32-bit: 2224
Difference in 64-bit: 9744

This further shows that it's not just a CLR thing, but is due to the difference in the underlying SEH implementations.