Reg*_*lez 7 c# compiler-construction cil
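This question is about the C# language specification and the CIL language specification, as well as the behavior of Microsoft's and Mono's C# compilers.
I am building some code analysis tools (never mind what exactly) that operate on CIL.
Looking at a few code samples, I notice that code statements (try/catch, if/else, if/then, loops, ...) generate contiguous blocks of MSIL.
But I would like to be sure that I cannot write a C# construct that yields non-contiguous MSIL. More specifically, can I write any C# statement that translates to (something like):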
IL_0000:
IL_0001:
IL_0002:
// hole
IL_001a:
IL_001b:
I have tried some weird things with goto and nested loops, but maybe I am not as mad as some users would be.
Eri*_*ert 13
Sure, that's trivially easy. Something like:
static void M(bool x)
{
    if (x)
        return;
    else
        M(x);
    return;
}
If you compile that in debug mode, you get:
IL_0000: nop
IL_0001: ldarg.0
IL_0002: stloc.0
IL_0003: ldloc.0
IL_0004: brfalse.s IL_0008
IL_0006: br.s IL_0011
IL_0008: ldarg.0
IL_0009: call void A::M(bool)
IL_000e: nop
IL_000f: br.s IL_0011
IL_0011: ret
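The if statement goes from 0001 to 0009, and the consequence of the if is a branch to 0011; both return statements are the same code, so there is a "hole" containing a nop and an unconditional branch between the main body of the if and the consequence.
More generally, you should never assume anything whatsoever about the layout of the IL generated by the C# compiler. The compiler guarantees only that the IL it produces will be legal and, when the source code is safe, verifiable.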
You say you are writing some code analysis tools; as the author of significant portions of the C# analyzer, and someone who worked on third-party analysis tools at Coverity, a word of advice: for the majority of questions you typically want answered about C# programs, the parse tree produced by Roslyn is the entity you wish to analyze, not the IL. The parse tree is a concrete syntax tree; it is one-to-one with every character in the source code. It can be very difficult to map optimized IL back to the original source code, and it can be very easy to produce false positives in an IL analysis.
Put another way: source-to-IL is semantics-preserving but also information-losing; you typically want to analyze the artifact that has the most information in it.
If you must, for whatever reason, operate your analyzer at the IL level, your first task should probably be to find the boundaries of the basic blocks, particularly if you are analyzing reachability properties.
A "basic block" is a contiguous chunk of IL where the end point of the block does not "carry on" to the following instruction -- because it is a branch, return or throw, for instance -- and there are no branches into the block to anywhere except the first instruction.
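As a rough illustration of that definition, here is a minimal sketch of finding the offsets where basic blocks begin (often called "leaders"). The Instr record, the Flow enum and the BasicBlocks class are hypothetical names invented for this example; a real tool would decode instructions from the method body with a metadata reader. The sketch also assumes at most one branch target per instruction, so it ignores switch tables and exception-handler regions, both of which introduce additional block boundaries.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified model of how an instruction ends (or does not end) a block.
public enum Flow { Next, Branch, CondBranch, Return, Throw }

// Hypothetical instruction model: offset, display text, control-flow kind, optional branch target.
public sealed record Instr(int Offset, string Text, Flow Flow = Flow.Next, int? Target = null);

public static class BasicBlocks
{
    // Returns the sorted offsets at which a basic block starts.
    public static List<int> FindLeaders(IReadOnlyList<Instr> body)
    {
        var leaders = new SortedSet<int>();
        if (body.Count == 0) return leaders.ToList();

        leaders.Add(body[0].Offset);                 // the method entry starts a block
        for (int i = 0; i < body.Count; i++)
        {
            var ins = body[i];
            if (ins.Target is int target)
                leaders.Add(target);                 // a branch target starts a block
            if (ins.Flow != Flow.Next && i + 1 < body.Count)
                leaders.Add(body[i + 1].Offset);     // so does whatever follows a block end
        }
        return leaders.ToList();
    }

    // The debug-mode IL of M(bool) from the listing earlier in this answer.
    public static IReadOnlyList<Instr> SampleBody() => new[]
    {
        new Instr(0x00, "nop"),
        new Instr(0x01, "ldarg.0"),
        new Instr(0x02, "stloc.0"),
        new Instr(0x03, "ldloc.0"),
        new Instr(0x04, "brfalse.s IL_0008", Flow.CondBranch, 0x08),
        new Instr(0x06, "br.s IL_0011", Flow.Branch, 0x11),
        new Instr(0x08, "ldarg.0"),
        new Instr(0x09, "call void A::M(bool)"),
        new Instr(0x0e, "nop"),
        new Instr(0x0f, "br.s IL_0011", Flow.Branch, 0x11),
        new Instr(0x11, "ret", Flow.Return),
    };
}
```

Calling BasicBlocks.FindLeaders(BasicBlocks.SampleBody()) yields the offsets 0x00, 0x06, 0x08 and 0x11, so under this model the method splits into four blocks: 0000 to 0004, 0006, 0008 to 000f, and 0011.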
You can then form a graph of basic blocks for each method, indicating which ones can possibly transfer control to which other blocks. This "raises the level" of your analysis; instead of analyzing the effects of a sequence of IL instructions, now you're analyzing the effects of a graph of basic blocks.
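One possible shape for such a graph, sketched under simplifying assumptions: each block is described only by its start offset, how it ends, and an optional branch target. The Block record, the BlockEnd enum and the FlowGraph class are hypothetical names made up for this illustration, and exception handlers and switch instructions are again left out. A depth-first walk over the graph then answers a simple reachability question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical block model: the way a block ends decides its successors.
public enum BlockEnd { FallThrough, Branch, CondBranch, Exit }

public sealed record Block(int Start, BlockEnd End, int? Target = null);

public static class FlowGraph
{
    // Maps each block's start offset to the start offsets it may transfer control to.
    public static Dictionary<int, List<int>> Successors(IReadOnlyList<Block> blocks)
    {
        var graph = new Dictionary<int, List<int>>();
        for (int i = 0; i < blocks.Count; i++)
        {
            var b = blocks[i];
            var succ = new List<int>();
            bool branches = b.End == BlockEnd.Branch || b.End == BlockEnd.CondBranch;
            if (branches && b.Target is int target)
                succ.Add(target);                    // edge to the branch target
            bool fallsThrough = b.End == BlockEnd.FallThrough || b.End == BlockEnd.CondBranch;
            if (fallsThrough && i + 1 < blocks.Count)
                succ.Add(blocks[i + 1].Start);       // edge to the next block in layout order
            graph[b.Start] = succ.Distinct().ToList();
        }
        return graph;
    }

    // Start offsets of all blocks reachable from the entry block (depth-first walk).
    public static SortedSet<int> Reachable(Dictionary<int, List<int>> graph, int entry)
    {
        var seen = new SortedSet<int>();
        var work = new Stack<int>();
        work.Push(entry);
        while (work.Count > 0)
        {
            int current = work.Pop();
            if (!seen.Add(current)) continue;        // already visited
            if (graph.TryGetValue(current, out var next))
                foreach (int n in next) work.Push(n);
        }
        return seen;
    }

    // The four blocks of the debug-mode M(bool) listing from earlier in this answer.
    public static IReadOnlyList<Block> SampleBlocks() => new[]
    {
        new Block(0x00, BlockEnd.CondBranch, 0x08),
        new Block(0x06, BlockEnd.Branch, 0x11),
        new Block(0x08, BlockEnd.Branch, 0x11),
        new Block(0x11, BlockEnd.Exit),
    };
}
```

For the sample, the block at 0x00 has successors 0x08 and 0x06, the blocks at 0x06 and 0x08 both lead to 0x11, and 0x11 has none; every block is reachable from the entry. A block that never shows up in the Reachable result is dead code as far as this simplified model can tell.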
If you say more about what sorts of analysis you're doing I can advise further.