Jen*_*nny 1 yacc bnf bison flex-lexer
我正在尝试从BNF Grammar编写一个Flex/Bison文件.但是,当我尝试编译时出现错误,我不确定如何调试它们.
BNF Grammer:
<exp>::=<list> | head(<list>)
<list>::=<num1>::<list> | <list>@<list> | tail(<list>) | [<numlist>]
<numlist>::=<empty> | <num1><num2>
<num2>::=<empty> | ,<num1><num2>
<num1>::=<D1><N> | <N> | head(<list>)
<D1>::=<D1><N> | <D2>
<D2>::=[1-9]
<N>::=[0-9]
Run Code Online (Sandbox Code Playgroud)
我的Flex文件
%option noyywrap
%{
#include <stdlib.h>
#include <list>
using namespace std;
#define YYSTYPE list<int>
#include "listop.tab.h"
%}
hd head
tl tail
ap [@()\[\]]
con ::
n [0-9]
d2 [1-9]
d1 ({d2}{n}+)|{d2}
ws[ \t\n]+
%%
{d1} yylval.clear(); yylval.push_front(atoi(yytext)); return D1;
{n} yylval.clear(); yylval.push_front(atoi(yytext)); return N;
{hd} return HEAD;
{tl} return TAIL;
{ap} return *yytext;
{con} return CON;
{ws} /* eat up white spaces */
%%
Run Code Online (Sandbox Code Playgroud)
我的野牛文件
%{
#include <math.h>
#include <stdio.h>
#include <iostream>
#include <list>
using namespace std;
#define YYSTYPE list<int>
void outputlist(list<int>);
int yylex(void);
int yyerror(const char*);
list<int> a;
%}
%token N
%token D1
%token HEAD
%token TAIL
%right CON
%left '@'
%% /* Grammer rules and actions follow */
input: /* empty */
| input exp ;
exp: list '\n' {outputlist($1);}
| HEAD '(' list ')' '\n' {cout <<$3.front();} ;
list: num1 CON list {$3.push_front($1.front()); $$=$3;}
| list '@' list {$1.splice($1.end(),$3); $$=$1;}
| TAIL '(' list ')'{$$ = $3;}
| '[' numlist ']' {$$ = $2;};
numlist: /* empty */
| num1 num2 {$1.splice($1.end(), $2); $$=$1;};
num2: /* empty */
| ',' num1 num2 {$2.splice($2.end(),$3); $$=$2;};
num1: D1 N { $1.splice($1.end(), $2); $$=$1;}
| N { $$ = $1;}
| HEAD '(' list ')' {list<int> a; a.push_front($3.front()); $$=a;};
%%
int main() { return yyparse();}
int yyerror(const char* s) { cout << "error" << endl; return 0;}
void outputlist(list<int> list1)
{
cout << '[';
for (int i = list1.size(); i > 1; i --)
{
cout << list1.front() << ',';
list1.pop_front();
}
cout << list1.front();
list1.pop_front();
cout << ']' << endl;
}
Run Code Online (Sandbox Code Playgroud)
由于YYSTYPE已被定义为list<int>两个文件的C声明部分,我使用库函数来插入/删除头部,获取列表的第一个元素等.
任何帮助表示赞赏.谢谢.
编辑:我已经编辑了上面的flex&bison文件.我的程序现在运行,但我没有得到正确的输出.这就是我得到的:
输入:头部([5,2,4])
输出:,错误
正确输出:5
使用可用的工具调试代码并不困难,例如bison,C++编译器和bison手册.当您似乎正在学习如何在简单练习中使用flex和bison时,您可能需要有关如何完成调试的教程.如果您不需要教程,也许可能是其他读者.以下是调试示例的教程.我目前正在使用计算机科学课程这样做,因此它将有一个教师语气.
首先,我看了你的BNF.我注意到你似乎没有严格遵守BNF表示法,布局有点不整洁.更好的版本是:
<exp> ::= <list>
| head ( <list> )
<list> ::= <num1> :: <list>
| <list> @ <list>
| tail ( <list> )
| [ <numlist> ]
<numlist> ::= <empty>
| <num1> <num2>
<num2> ::= <empty>
| , <num1> <num2>
<num1> ::= <D1> <N>
| <N> | head ( <list> )
<D1> ::= <D1> <N>
| <D2>
<D2> ::= 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<N> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<empty> ::=
Run Code Online (Sandbox Code Playgroud)
了解布局如何提高可读性?您已经将[]符号用于两个不同的目的,就像它们自己一样,并指示一个字符集.字符集是flex表示法.这些[]字符在扩展BNF中用作元字符,但它们用于表示选项而不是集合.您尚未定义<empty>非终端符号.尽管此示例遵循严格的表示法,但为了防止终端和元符号之间的混淆,将使用引号,如:
<exp> ::= <list>
| head "(" <list> ")"
<list> ::= <num1> "::" <list>
| <list> "@" <list>
| tail "(" <list> ")"
| "[" <numlist> "]"
<numlist> ::= <empty>
| <num1> <num2>
<num2> ::= <empty>
| "," <num1> <num2>
<num1> ::= <D1> <N>
| <N> | head "(" <list> ")"
<D1> ::= <D1> <N>
| <D2>
<D2> ::= 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<N> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<empty> ::=
Run Code Online (Sandbox Code Playgroud)
现在让我们来看看词法分析.如果我们通过@rici进行修复建议,并添加缺少的逗号,则我们有以下规则:
ap [@()\[\],]
Run Code Online (Sandbox Code Playgroud)
现在,bison具有内置的调试功能,这对于理解解析如何工作非常有用.为代码启用此功能可以清楚地看到它失败的地方.首先,我们必须启用跟踪,如手册中所述.我们需要做两件事:
yydebug非零-t(或--debug)选项构建解析器因此,在您的listop.y文件中,您应该将main()函数更改为:
int main() {
#ifdef YYDEBUG
extern int yydebug;
yydebug = 1;
#endif
return yyparse();}
Run Code Online (Sandbox Code Playgroud)
重建野牛:
bison listop.y -d -t
Run Code Online (Sandbox Code Playgroud)
现在运行flex并重新编译:
flex listop.l
g++ -o listop.exe listop.tab.c lex.yy.c -DYYDEBUG -lfl
Run Code Online (Sandbox Code Playgroud)
现在,当我们运行时,我们得到一些诊断输出(供您输入head([5,2,4])):
Starting parse
Entering state 0
Reducing stack by rule 1 (line 25):
-> $$ = nterm input ()
Stack now 0
Entering state 1
Reading a token: head([5,2,4])
Next token is token HEAD ()
Shifting token HEAD ()
Entering state 5
Reading a token: Next token is token '(' ()
Shifting token '(' ()
Entering state 12
Reading a token: Next token is token '[' ()
Shifting token '[' ()
Entering state 7
Reading a token: Next token is token D1 ()
Shifting token D1 ()
Entering state 4
Reading a token: Next token is token ',' ()
error
Error: popping token D1 ()
Stack now 0 1 5 12 7
Error: popping token '[' ()
Stack now 0 1 5 12
Error: popping token '(' ()
Stack now 0 1 5
Error: popping token HEAD ()
Stack now 0 1
Error: popping nterm input ()
Stack now 0
Cleanup: discarding lookahead token ',' ()
Stack now 0
Run Code Online (Sandbox Code Playgroud)
从这一行可以看出:
Reading a token: Next token is token D1 ()
Run Code Online (Sandbox Code Playgroud)
数字5已被D1词法分析器作为令牌返回.无论如何咨询BNF语法,我们都可以看到它应该被视为令牌N.由于D1并N包含大致相同的数字,因此可以选择其中一个.它为什么选择D1?这是因为模式/动作规则的排序.较高的规则优先于较低的规则.要解决此问题,有必要更改规则的顺序:
{d1} yylval.clear(); yylval.push_front(atoi(yytext)); return D1;
{n} yylval.clear(); yylval.push_front(atoi(yytext)); return N;
Run Code Online (Sandbox Code Playgroud)
至:
{n} yylval.clear(); yylval.push_front(atoi(yytext)); return N;
{d1} yylval.clear(); yylval.push_front(atoi(yytext)); return D1;
Run Code Online (Sandbox Code Playgroud)
然后再次构建所有部分.如果我们再试一次,我们得到以下更长的痕迹:
Starting parse
Entering state 0
Reducing stack by rule 1 (line 25):
-> $$ = nterm input ()
Stack now 0
Entering state 1
Reading a token: head([5,2,4])
Next token is token HEAD ()
Shifting token HEAD ()
Entering state 5
Reading a token: Next token is token '(' ()
Shifting token '(' ()
Entering state 12
Reading a token: Next token is token '[' ()
Shifting token '[' ()
Entering state 7
Reading a token: Next token is token N ()
Shifting token N ()
Entering state 3
Reducing stack by rule 14 (line 38):
$1 = token N ()
-> $$ = nterm num1 ()
Stack now 0 1 5 12 7
Entering state 16
Reading a token: Next token is token ',' ()
Shifting token ',' ()
Entering state 24
Reading a token: Next token is token N ()
Shifting token N ()
Entering state 3
Reducing stack by rule 14 (line 38):
$1 = token N ()
-> $$ = nterm num1 ()
Stack now 0 1 5 12 7 16 24
Entering state 31
Reading a token: Next token is token ',' ()
Shifting token ',' ()
Entering state 24
Reading a token: Next token is token N ()
Shifting token N ()
Entering state 3
Reducing stack by rule 14 (line 38):
$1 = token N ()
-> $$ = nterm num1 ()
Stack now 0 1 5 12 7 16 24 31 24
Entering state 31
Reading a token: Next token is token ']' ()
Reducing stack by rule 11 (line 35):
-> $$ = nterm num2 ()
Stack now 0 1 5 12 7 16 24 31 24 31
Entering state 34
Reducing stack by rule 12 (line 36):
$1 = token ',' ()
$2 = nterm num1 ()
$3 = nterm num2 ()
-> $$ = nterm num2 ()
Stack now 0 1 5 12 7 16 24 31
Entering state 34
Reducing stack by rule 12 (line 36):
$1 = token ',' ()
$2 = nterm num1 ()
$3 = nterm num2 ()
-> $$ = nterm num2 ()
Stack now 0 1 5 12 7 16
Entering state 25
Reducing stack by rule 10 (line 34):
$1 = nterm num1 ()
$2 = nterm num2 ()
-> $$ = nterm numlist ()
Stack now 0 1 5 12 7
Entering state 15
Next token is token ']' ()
Shifting token ']' ()
Entering state 23
Reducing stack by rule 8 (line 32):
$1 = token '[' ()
$2 = nterm numlist ()
$3 = token ']' ()
-> $$ = nterm list ()
Stack now 0 1 5 12
Entering state 20
Reading a token: Next token is token ')' ()
Shifting token ')' ()
Entering state 28
Reading a token:
Run Code Online (Sandbox Code Playgroud)
但是,5仍未输出预期的结果().检查操作规则,我们可以看到outputlist仅在'\n'匹配令牌且跟踪未显示任何'\n'令牌时才调用该方法.这是因为它们在词法分析器中被删除为空格.需要更改flex规则来解决此问题:
ap [@()\[\],\n]
ws[ \t]+
Run Code Online (Sandbox Code Playgroud)
并重新重建.它现在有效,输出正确的值:
Starting parse
Entering state 0
Reducing stack by rule 1 (line 25):
-> $$ = nterm input ()
Stack now 0
Entering state 1
Reading a token: head([5,2,4])
Next token is token HEAD ()
Shifting token HEAD ()
Entering state 5
Reading a token: Next token is token '(' ()
Shifting token '(' ()
Entering state 12
Reading a token: Next token is token '[' ()
Shifting token '[' ()
Entering state 7
Reading a token: Next token is token N ()
Shifting token N ()
Entering state 3
Reducing stack by rule 14 (line 38):
$1 = token N ()
-> $$ = nterm num1 ()
Stack now 0 1 5 12 7
Entering state 16
Reading a token: Next token is token ',' ()
Shifting token ',' ()
Entering state 24
Reading a token: Next token is token N ()
Shifting token N ()
Entering state 3
Reducing stack by rule 14 (line 38):
$1 = token N ()
-> $$ = nterm num1 ()
Stack now 0 1 5 12 7 16 24
Entering state 31
Reading a token: Next token is token ',' ()
Shifting token ',' ()
Entering state 24
Reading a token: Next token is token N ()
Shifting token N ()
Entering state 3
Reducing stack by rule 14 (line 38):
$1 = token N ()
-> $$ = nterm num1 ()
Stack now 0 1 5 12 7 16 24 31 24
Entering state 31
Reading a token: Next token is token ']' ()
Reducing stack by rule 11 (line 35):
-> $$ = nterm num2 ()
Stack now 0 1 5 12 7 16 24 31 24 31
Entering state 34
Reducing stack by rule 12 (line 36):
$1 = token ',' ()
$2 = nterm num1 ()
$3 = nterm num2 ()
-> $$ = nterm num2 ()
Stack now 0 1 5 12 7 16 24 31
Entering state 34
Reducing stack by rule 12 (line 36):
$1 = token ',' ()
$2 = nterm num1 ()
$3 = nterm num2 ()
-> $$ = nterm num2 ()
Stack now 0 1 5 12 7 16
Entering state 25
Reducing stack by rule 10 (line 34):
$1 = nterm num1 ()
$2 = nterm num2 ()
-> $$ = nterm numlist ()
Stack now 0 1 5 12 7
Entering state 15
Next token is token ']' ()
Shifting token ']' ()
Entering state 23
Reducing stack by rule 8 (line 32):
$1 = token '[' ()
$2 = nterm numlist ()
$3 = token ']' ()
-> $$ = nterm list ()
Stack now 0 1 5 12
Entering state 20
Reading a token: Next token is token ')' ()
Shifting token ')' ()
Entering state 28
Reading a token: Next token is token '\n' ()
Shifting token '\n' ()
Entering state 32
Reducing stack by rule 4 (line 28):
$1 = token HEAD ()
$2 = token '(' ()
$3 = nterm list ()
$4 = token ')' ()
$5 = token '\n' ()
5-> $$ = nterm exp ()
Stack now 0 1
Entering state 8
Reducing stack by rule 2 (line 26):
$1 = nterm input ()
$2 = nterm exp ()
-> $$ = nterm input ()
Stack now 0
Entering state 1
Reading a token: ^Z
Now at end of input.
Shifting token $end ()
Entering state 2
Stack now 0 1 2
Cleanup: popping token $end ()
Cleanup: popping nterm input ()
Run Code Online (Sandbox Code Playgroud)
实际上,虽然调试使程序运行,但它仍然是一个糟糕的方法.它没有使用词法分析器来构建令牌和解析器以匹配语法.所有数字识别都应该在flex代码中完成.例如,为什么不只是像这样的词法分析器:
%option noyywrap
%{
#include <stdlib.h>
#include <list>
using namespace std;
#define YYSTYPE list<int>
#include "listop.tab.h"
%}
hd head
tl tail
ap [@()\[\],\n]
con ::
n [0-9]
d2 [1-9]
num ({d2}+{n}?)|{n}
ws[ \t]+
%%
{num} return NUM; yylval.push_front(atoi(yytext));
{hd} return HEAD;
{tl} return TAIL;
{ap} return *yytext;
{con} return CON;
{ws} /* eat up white spaces */
%%
Run Code Online (Sandbox Code Playgroud)
然后,您可以稍微简化解析器中的语法,并使整体更简单的实现.
我将把这部分作为练习.
| 归档时间: |
|
| 查看次数: |
3154 次 |
| 最近记录: |