前瞻套装的精确定义是什么?

Lyu*_*mil 6 compiler-construction compiler-theory

我正在编写编写器并学习语法分析背后的理论.我发现即使它是理解识别算法的关键概念,但网上的信息相当差.似乎StackOverflow处于解决此问题的独特位置.

Sim*_*ine 8

语法的前瞻设置是根据每个非终端的前瞻集来定义的,而后者又依赖于每个生产的先行集.确定先行集可以帮助我们确定语法是否为LL(1),如果是,我们需要为它构造递归下降解析器所需的信息.

定义: LOOKAHEAD(X - >α)LOOKAHEAD(X):

LOOKAHEAD(X -> ?) = FIRST(?) U FOLLOW(X), if NULLABLE(?)
LOOKAHEAD(X -> ?) = FIRST(?), if not NULLABLE(?)
LOOKAHEAD(X) = LOOKAHEAD(X -> ?) U LOOKAHEAD(X -> ?) U LOOKAHEAD(X -> ?)
Run Code Online (Sandbox Code Playgroud)

其中FIRST(α)是一组α可以首先,终端FOLLOW(X)是该组的终端的可来到一个后X在语法任意位置,可以为空(α)是α是否可以得到的空序列终端(表示为ε).以下定义取自Torben Mogensen的免费书籍"编译器设计基础".请参阅下面的示例.

定义: NULLABLE(X):

NULLABLE(?) = true
NULLABLE(x) = false, if x is a terminal
NULLABLE(??) = NULLABLE(?) and NULLABLE(?)
NULLABLE(P) = NULLABLE(?_1) or NULLABLE(?_2) or ... or NULLABLE(?_n),
               if P is a non-terminal and the right-hand-sides
               of all its productions are ?_1, ?_2, ..., ?_n.
Run Code Online (Sandbox Code Playgroud)

定义: FIRST(X):

FIRST(?) = Ø
FIRST(x) = {x}, assuming x is a terminal
FIRST(??) = FIRST(?) U FIRST(?), if NULLABLE(?)
          = FIRST(?), if not NULLABLE(?)
FIRST(P) = FIRST(?_1) U FIRST(?_2) U ... U FIRST(?_n),
               if P is a non-terminal and the right-hand-sides
               of all its productions are ?_1, ?_2, ..., ?_n.
Run Code Online (Sandbox Code Playgroud)

定义:关注 (X):

终端符号a是在FOLLOW(X)当且仅当存在来自语法的开始符号S上推导这样该S⇒αXAβ,其中α和β是文法符号的(可能为空)的序列.

直觉: 跟随(X):

看看语法中出现X的位置.所有可以跟随它的终端(直接或通过任何级别的递归)都在FOLLOW(X)中.另外,如果X在生产结束时发生(例如A -> foo X),或者之后是可以减少到ε(例如A -> foo X BB -> ?)的其他东西,那么无论A可以跟随什么,X也可以跟随(即FOLLOW(A) ? FOLLOW(X)).

请参阅Torben的书中确定FOLLOW(X)的方法,并在下面进行演示.

一个例子:

E -> n A
A -> E B
A -> ?
B -> + A
B -> * A
Run Code Online (Sandbox Code Playgroud)

首先,NULLABLEFIRST并确定:

NULLABLE(E) = NULLABLE(n A) = NULLABLE(n) ? NULLABLE(A) = false
NULLABLE(A) = NULLABLE(E B) ? NULLABLE(?) = true
NULLABLE(B) = NULLABLE(+ A) ? NULLABLE(* A) = false

FIRST(E) = FIRST(n A) = {n}
FIRST(A) = FIRST(E B) U FIRST(?) = FIRST(E) U Ø = {n} (because E is not NULLABLE)
FIRST(B) = FIRST(+ A) U FIRST(* A) = FIRST(+) U FIRST(*) = {+, *}
Run Code Online (Sandbox Code Playgroud)

在确定FOLLOW之前,E' -> E $添加生产,其中$被视为"文件结束"非终端.然后确定FOLLOW:

FOLLOW(E): Let ? = $, so add the constraint that FIRST($) = {$} ? FOLLOW(E)
           Let ? = B, so add the constraint that FIRST(B) = {+, *} ? FOLLOW(E)
FOLLOW(A): Let ? = ?, so add the constraint that FIRST(?) = Ø ? FOLLOW(A).
           Because NULLABLE(?), add the constraint that FOLLOW(E) ? FOLLOW(A).
           Let ? = ?, so add the constraint that FIRST(?) = Ø ? FOLLOW(A).
           Because NULLABLE(?), add the constraint that FOLLOW(B) ? FOLLOW(A).
           Let ? = ?, so add the constraint that FIRST(?) = Ø ? FOLLOW(A).
           Because NULLABLE(?), add the constraint that FOLLOW(B) ? FOLLOW(A).
FOLLOW(B): Let ? = ?, so add the constraint that FIRST(?) = Ø ? FOLLOW(B).
           Because NULLABLE(?), add the constraint that FOLLOW(A) ? FOLLOW(B).
Run Code Online (Sandbox Code Playgroud)

解决这些约束(也可以通过定点迭代实现),

    {+, *, $} ? FOLLOW(E)
    FOLLOW(E) ? FOLLOW(A)
    FOLLOW(A) = FOLLOW(B)

    FOLLOW(E) = FOLLOW(A) = FOLLOW(B) = {+, *, $}.
Run Code Online (Sandbox Code Playgroud)

现在可以确定每个生产的LOOKAHEAD:

LOOKAHEAD(E -> n A) = FIRST(n A) = {n}     because ¬NULLABLE(n A)
LOOKAHEAD(A -> E B) = FIRST(E B)           because ¬NULLABLE(E B)
                    = FIRST(E) = {n}       because ¬NULLABLE(E)
LOOKAHEAD(A -> ?)   = FIRST(?) U FOLLOW(A) because NULLABLE(?)
                    = Ø U {+, *, $} = {+, *, $}
LOOKAHEAD(B -> + A) = FIRST(+ A)           because ¬NULLABLE(+ A)
                    = FIRST(+) = {+}       because ¬NULLABLE(+)
LOOKAHEAD(B -> * A) = {*}                  for the same reason
Run Code Online (Sandbox Code Playgroud)

最后,可以确定每个非终端的LOOKAHEAD:

LOOKAHEAD(E) = LOOKAHEAD(E -> n A) = {n}
LOOKAHEAD(A) = LOOKAHEAD(A -> E B) U LOOKAHEAD(A -> ?)   = {n} U {+, *, $}
LOOKAHEAD(B) = LOOKAHEAD(B -> + A) U LOOKAHEAD(B -> * A) = {+, *}
Run Code Online (Sandbox Code Playgroud)

通过这种知识,我们可以确定该语法不是LL(1),因为它的非终端具有重叠的先行集.(即,我们无法创建一次只读取一个符号的程序,并明确决定使用哪个生产.)