Håk*_*and 4 grammar perl6 raku
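I am starting to write a BibTeX parser. The first thing I would like to do is parse a braced item. A braced item could be an author field or a title, for example. There might be nested braces within the field. The following code does not handle nested braces: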
    use v6;

    my $str = q:to/END/;
    author={Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.},
    END

    $str .= chomp;

    grammar ExtractBraced {
        rule TOP {
            'author=' <braced-item> .*
        }
        rule braced-item { '{' <-[}]>* '}' }
    }

    ExtractBraced.parse( $str ).say;
Output:

    ｢author={Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.},｣
     braced-item => ｢{Belayneh, M. and Geiger, S. and Matth{\"{a}｣
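Now, to make the parser accept nested braces, I would like to keep a counter of the number of opening braces currently parsed; when a closing brace is encountered, the counter is decremented. If the counter reaches zero, we assume that the complete item has been parsed.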
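For reference, the counter idea can be sketched outside a grammar as a plain scan over the characters. This is only an illustration under assumptions: `balanced-braces` is a hypothetical helper name, and escaped braces such as `\{` are not treated specially.

```raku
# Hypothetical helper (not part of the original post): return the balanced
# "{...}" group that starts at position $start, or Nil if there is none.
sub balanced-braces(Str $text, Int $start = 0) {
    return Nil unless $text.substr($start, 1) eq '{';
    my $depth = 0;
    for $start ..^ $text.chars -> $i {
        given $text.substr($i, 1) {
            when '{' { $depth++ }   # one more brace is open
            when '}' { $depth-- }   # one open brace was closed
        }
        # Depth back at zero: the group that began at $start is complete.
        return $text.substr($start, $i - $start + 1) if $depth == 0;
    }
    return Nil;   # ran out of input with braces still open
}

my $field = 'author={Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.},';
put balanced-braces($field, $field.index('{'));
```

The scan stops at the first position where the depth returns to zero, which is exactly the "counter reaches zero" condition described above.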
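To follow this idea, I tried to split up the braced-item regex so that a grammar action could be implemented for each char. (The action method for the braced-item-char regex below would then handle the brace counter):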
    grammar ExtractBraced {
        rule TOP {
            'author=' <braced-item> .*
        }
        rule braced-item { '{' <braced-item-char>* '}' }
        rule braced-item-char { <-[}]> }
    }
However, now the parsing suddenly fails. It is probably a silly mistake, but I cannot see why it should fail now?
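A likely explanation, offered as a sketch rather than a definitive diagnosis: a `rule` enables `:sigspace`, so the whitespace after `<-[}]>` inside `braced-item-char` becomes an implicit `<.ws>` call. The default `ws` is `<!ww> \s*`, which fails between two word characters, so `braced-item-char` cannot match the `B` in `Belayneh` (it is followed by `e`). The quantified group then matches nothing, and `'}'` fails against `B`. Declaring the pieces as `token` avoids the implicit `<.ws>`. The grammar names `WithRule` and `WithToken` below are made up for the comparison.

```raku
my $str = 'author={Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.},';

# Same grammar as in the question: every piece is a rule (implies :sigspace).
grammar WithRule {
    rule TOP { 'author=' <braced-item> .* }
    rule braced-item { '{' <braced-item-char>* '}' }
    rule braced-item-char { <-[}]> }
}

# Assumed fix: tokens do not insert <.ws> where the pattern has whitespace.
grammar WithToken {
    token TOP { 'author=' <braced-item> .* }
    token braced-item { '{' <braced-item-char>* '}' }
    token braced-item-char { <-[}]> }
}

say WithRule.parse($str).defined;    # False: the parse fails, as reported above
my $m = WithToken.parse($str);
put ~$m<braced-item>;                # still stops at the first closing brace
```

This only restores the behaviour of the first version; nested braces are still not handled.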
Without knowing what you want the resulting data to look like, I would change it to something like this:
    my $str = ｢author={Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.},｣;

    grammar ExtractBraced {
        token TOP {
            'author='
            $<author> = <.braced-item>
            .*
        }
        token braced-item {
            '{' ~ '}'
            [
            || <- [{}] >+
            || <.before '{'> <.braced-item>
            ]*
        }
    }

    ExtractBraced.parse( $str ).say;
    ｢author={Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.},｣
     author => ｢{Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.}｣
If you want a bit more structure, it might look more like this:
    my $str = ｢author={Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.},｣;

    grammar ExtractBraced {
        token TOP {
            'author='
            $<author> = <.braced-item>
            .*
        }
        token braced-part {
            || <- [{}] >+
            || <.before '{'> <braced-item>
        }
        token braced-item {
            '{' ~ '}'
            <braced-part>*
        }
    }

    class Print {
        method TOP ($/){
            make $<author>.made
        }
        method braced-part ($/){
            make $<braced-item>.?made // ~$/
        }
        method braced-item ($/){
            make [~] @<braced-part>».made
        }
    }

    my $r = ExtractBraced.parse( $str, :actions(Print) );
    say $r;
    put();
    say $r.made;
    ｢author={Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.},｣
     author => ｢{Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.}｣
      braced-part => ｢Belayneh, M. and Geiger, S. and Matth｣
      braced-part => ｢{\"{a}}｣
       braced-item => ｢{\"{a}}｣
        braced-part => ｢\"｣
        braced-part => ｢{a}｣
         braced-item => ｢{a}｣
          braced-part => ｢a｣
      braced-part => ｢i, S.K.｣

    Belayneh, M. and Geiger, S. and Matth\"ai, S.K.
Note that the + on <-[{}]>+ is an optimization, and so is <before '{'>; both can be omitted and it will still work.
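To illustrate that note, here is a minimal sketch of the first grammar with both optimizations dropped. The name `ExtractBracedMinimal` is hypothetical and only used to keep it apart from the grammars above.

```raku
my $str = 'author={Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.},';

grammar ExtractBracedMinimal {
    token TOP {
        'author='
        $<author> = <.braced-item>
        .*
    }
    token braced-item {
        '{' ~ '}'
        [
        || <-[{}]>           # one non-brace character at a time (no +)
        || <.braced-item>    # recurse directly (no <before '{'> guard)
        ]*
    }
}

my $m = ExtractBracedMinimal.parse($str);
put ~$m<author>;
```

Without the `+`, the alternation is retried for every single character, and without the lookahead the recursive call is attempted whenever the first branch fails, which is why both additions save work without changing what is matched.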