Lyu*_*mil 6 compiler-construction compiler-theory
我正在编写编写器并学习语法分析背后的理论.我发现即使它是理解识别算法的关键概念,但网上的信息相当差.似乎StackOverflow处于解决此问题的独特位置.
语法的前瞻设置是根据每个非终端的前瞻集来定义的,而后者又依赖于每个生产的先行集.确定先行集可以帮助我们确定语法是否为LL(1),如果是,我们需要为它构造递归下降解析器所需的信息.
定义: LOOKAHEAD(X - >α)和LOOKAHEAD(X):
LOOKAHEAD(X -> ?) = FIRST(?) U FOLLOW(X), if NULLABLE(?)
LOOKAHEAD(X -> ?) = FIRST(?), if not NULLABLE(?)
LOOKAHEAD(X) = LOOKAHEAD(X -> ?) U LOOKAHEAD(X -> ?) U LOOKAHEAD(X -> ?)
Run Code Online (Sandbox Code Playgroud)
其中FIRST(α)是一组α可以首先,终端FOLLOW(X)是该组的终端的可来到一个后X在语法任意位置,可以为空(α)是α是否可以得到的空序列终端(表示为ε).以下定义取自Torben Mogensen的免费书籍"编译器设计基础".请参阅下面的示例.
定义: NULLABLE(X):
NULLABLE(?) = true
NULLABLE(x) = false, if x is a terminal
NULLABLE(??) = NULLABLE(?) and NULLABLE(?)
NULLABLE(P) = NULLABLE(?_1) or NULLABLE(?_2) or ... or NULLABLE(?_n),
if P is a non-terminal and the right-hand-sides
of all its productions are ?_1, ?_2, ..., ?_n.
Run Code Online (Sandbox Code Playgroud)
定义: FIRST(X):
FIRST(?) = Ø
FIRST(x) = {x}, assuming x is a terminal
FIRST(??) = FIRST(?) U FIRST(?), if NULLABLE(?)
= FIRST(?), if not NULLABLE(?)
FIRST(P) = FIRST(?_1) U FIRST(?_2) U ... U FIRST(?_n),
if P is a non-terminal and the right-hand-sides
of all its productions are ?_1, ?_2, ..., ?_n.
Run Code Online (Sandbox Code Playgroud)
定义:关注 (X):
终端符号a是在FOLLOW(X)当且仅当存在来自语法的开始符号S上推导这样该S⇒αXAβ,其中α和β是文法符号的(可能为空)的序列.
直觉: 跟随(X):
看看语法中出现X的位置.所有可以跟随它的终端(直接或通过任何级别的递归)都在FOLLOW(X)中.另外,如果X在生产结束时发生(例如
A -> foo X
),或者之后是可以减少到ε(例如A -> foo X B
和B -> ?
)的其他东西,那么无论A可以跟随什么,X也可以跟随(即FOLLOW(A) ? FOLLOW(X)
).
请参阅Torben的书中确定FOLLOW(X)的方法,并在下面进行演示.
一个例子:
E -> n A
A -> E B
A -> ?
B -> + A
B -> * A
Run Code Online (Sandbox Code Playgroud)
首先,NULLABLE和FIRST并确定:
NULLABLE(E) = NULLABLE(n A) = NULLABLE(n) ? NULLABLE(A) = false
NULLABLE(A) = NULLABLE(E B) ? NULLABLE(?) = true
NULLABLE(B) = NULLABLE(+ A) ? NULLABLE(* A) = false
FIRST(E) = FIRST(n A) = {n}
FIRST(A) = FIRST(E B) U FIRST(?) = FIRST(E) U Ø = {n} (because E is not NULLABLE)
FIRST(B) = FIRST(+ A) U FIRST(* A) = FIRST(+) U FIRST(*) = {+, *}
Run Code Online (Sandbox Code Playgroud)
在确定FOLLOW之前,E' -> E $
添加生产,其中$
被视为"文件结束"非终端.然后确定FOLLOW:
FOLLOW(E): Let ? = $, so add the constraint that FIRST($) = {$} ? FOLLOW(E)
Let ? = B, so add the constraint that FIRST(B) = {+, *} ? FOLLOW(E)
FOLLOW(A): Let ? = ?, so add the constraint that FIRST(?) = Ø ? FOLLOW(A).
Because NULLABLE(?), add the constraint that FOLLOW(E) ? FOLLOW(A).
Let ? = ?, so add the constraint that FIRST(?) = Ø ? FOLLOW(A).
Because NULLABLE(?), add the constraint that FOLLOW(B) ? FOLLOW(A).
Let ? = ?, so add the constraint that FIRST(?) = Ø ? FOLLOW(A).
Because NULLABLE(?), add the constraint that FOLLOW(B) ? FOLLOW(A).
FOLLOW(B): Let ? = ?, so add the constraint that FIRST(?) = Ø ? FOLLOW(B).
Because NULLABLE(?), add the constraint that FOLLOW(A) ? FOLLOW(B).
Run Code Online (Sandbox Code Playgroud)
解决这些约束(也可以通过定点迭代实现),
{+, *, $} ? FOLLOW(E)
FOLLOW(E) ? FOLLOW(A)
FOLLOW(A) = FOLLOW(B)
FOLLOW(E) = FOLLOW(A) = FOLLOW(B) = {+, *, $}.
Run Code Online (Sandbox Code Playgroud)
现在可以确定每个生产的LOOKAHEAD:
LOOKAHEAD(E -> n A) = FIRST(n A) = {n} because ¬NULLABLE(n A)
LOOKAHEAD(A -> E B) = FIRST(E B) because ¬NULLABLE(E B)
= FIRST(E) = {n} because ¬NULLABLE(E)
LOOKAHEAD(A -> ?) = FIRST(?) U FOLLOW(A) because NULLABLE(?)
= Ø U {+, *, $} = {+, *, $}
LOOKAHEAD(B -> + A) = FIRST(+ A) because ¬NULLABLE(+ A)
= FIRST(+) = {+} because ¬NULLABLE(+)
LOOKAHEAD(B -> * A) = {*} for the same reason
Run Code Online (Sandbox Code Playgroud)
最后,可以确定每个非终端的LOOKAHEAD:
LOOKAHEAD(E) = LOOKAHEAD(E -> n A) = {n}
LOOKAHEAD(A) = LOOKAHEAD(A -> E B) U LOOKAHEAD(A -> ?) = {n} U {+, *, $}
LOOKAHEAD(B) = LOOKAHEAD(B -> + A) U LOOKAHEAD(B -> * A) = {+, *}
Run Code Online (Sandbox Code Playgroud)
通过这种知识,我们可以确定该语法不是LL(1),因为它的非终端具有重叠的先行集.(即,我们无法创建一次只读取一个符号的程序,并明确决定使用哪个生产.)