This code parses $string the way I want:
#! /usr/bin/env raku

# Note: the "aaa bbb" line has trailing spaces which I want to keep
my $string = q:to/END/;
aaa bbb 

 kjkjsdf
kjkdsf
END

grammar Markdown {
    token TOP { ^ ([ <blank> | <text> ])+ $ }
    token blank { [ \h* <.newline> ] }
    token text { <indent> <content> }
    token indent { \h* }
    token newline { \n }
    token content { \N*? <trailing>* <.newline> }
    token trailing { \h+ }
}

my $match = Markdown.parse($string);
$match.say;

Output:
｢aaa bbb

 kjkjsdf
kjkdsf
｣
 0 => ｢aaa bbb
｣
  text => ｢aaa bbb
｣
   indent => ｢｣
   content => ｢aaa bbb
｣
    trailing => ｢ ｣
 0 => ｢
｣
  blank => ｢
｣
 0 => ｢ kjkjsdf
｣
  text => ｢ kjkjsdf
｣
   indent => ｢ ｣
   content => ｢kjkjsdf
｣
 0 => ｢kjkdsf
｣
  text => ｢kjkdsf
｣
   indent => ｢｣
   content => ｢kjkdsf
｣

Now, the only problem I'm having is that I'd like the <trailing> capture to be at the same level of the hierarchy as the <indent> and <content> captures.
So I tried this grammar:
grammar Markdown {
    token TOP { ^ ([ <blank> | <text> ])+ $ }
    token blank { [ \h* <.newline> ] }
    token text { <indent> <content> <trailing>* <.newline> }
    token indent { \h* }
    token newline { \n }
    token content { \N*? }
    token trailing { \h+ }
}

However, that broke the parsing. So I tried this:
    token TOP { ^ ([ <blank> | <text> ])+ $ }
    token blank { [ \h* <.newline> ] }
    token text { <indent> <content>*? <trailing>* <.newline> }
    token indent { \h* }
    token newline { \n }
    token content { \N }
    token trailing { \h+ }

and got:
 0 => ｢aaa bbb
｣
  text => ｢aaa bbb
｣
   indent => ｢｣
   content => ｢a｣
   content => ｢a｣
   content => ｢a｣
   content => ｢ ｣
   content => ｢b｣
   content => ｢b｣
   content => ｢b｣
   trailing => ｢ ｣
 0 => ｢
｣
  blank => ｢
｣
 0 => ｢ kjkjsdf
｣
  text => ｢ kjkjsdf
｣
   indent => ｢ ｣
   content => ｢k｣
   content => ｢j｣
   content => ｢k｣
   content => ｢j｣
   content => ｢s｣
   content => ｢d｣
   content => ｢f｣
 0 => ｢kjkdsf
｣
  text => ｢kjkdsf
｣
   indent => ｢｣
   content => ｢k｣
   content => ｢j｣
   content => ｢k｣
   content => ｢d｣
   content => ｢s｣
   content => ｢f｣

This is pretty close to what I want, but it has the undesirable effect of breaking <content> up into individual letters, which is not ideal. I could fix this fairly easily after the fact by massaging the $match object, but I'd like to improve my skills with grammars.
Quick and dirty
my $string = q:to/END/;
aaa bbb
kjkjsdf
kjkdsf
END
grammar Markdown {
    token TOP { ^ ([ <blank> | <text> ])+ $ }
    token blank { [ \h* <.newline> ] }
    token text { <indent>? $<content>=\N*? <trailing>? <.newline> }
    token indent { \h+ }
    token newline { \n }
    token trailing { \h+ }
}
my $match = Markdown.parse($string);
$match.say;
Lookahead assertion
my $string = q:to/END/;
aaa bbb
kjkjsdf
kjkdsf
END
grammar Markdown {
    token TOP { ^ ([ <blank> | <text> ])+ $ }
    token blank { [ \h* <.newline> ] }
    token text { <indent>? <content> <trailing>? <.newline> }
    token indent { \h+ }
    token newline { \n }
    token content { [<!before <trailing>> \N]+ }
    token trailing { \h+ $$ }
}
my $match = Markdown.parse($string);
$match.say;
A little refactoring
my $string = q:to/END/;
aaa bbb
kjkjsdf
kjkdsf
END
grammar Markdown {
    token TOP { ( <blank> | <text> )+ %% \n }
    token blank { ^^ \h* $$ }
    token text { <indent>? <content> <trailing>? }
    token indent { ^^ \h+ }
    token content { [<!before <trailing>> \N]+ }
    token trailing { \h+ $$ }
}
my $match = Markdown.parse($string);
$match.say;
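Reading the captures back

With <indent>, <content> and <trailing> all sitting directly under <text>, each line's pieces can be read off without any post-processing of the match. The following is only a minimal sketch: the inline sample string and the variable names $line, $text, $indent, $content and $trailing are assumptions made for illustration and are not part of the original code; the grammar is the refactored one from above.

```raku
# Sketch only: the sample text is a stand-in for the original heredoc.
grammar Markdown {
    token TOP      { ( <blank> | <text> )+ %% \n }
    token blank    { ^^ \h* $$ }
    token text     { <indent>? <content> <trailing>? }
    token indent   { ^^ \h+ }
    token content  { [<!before <trailing>> \N]+ }
    token trailing { \h+ $$ }
}

my $match = Markdown.parse("aaa bbb  \n\n   kjkjsdf\nkjkdsf\n");

# $match[0] holds one Match per repetition of the positional group in TOP.
for $match[0].list -> $line {
    if $line<text> -> $text {
        # <indent> and <trailing> are optional, so fall back to an empty string.
        my $indent   = ($text<indent>   // '').Str;
        my $content  = $text<content>.Str;
        my $trailing = ($text<trailing> // '').Str;
        say "indent: ", $indent.chars,
            ", content: ", $content.raku,
            ", trailing: ", $trailing.chars;
    }
    else {
        say "(blank line)";
    }
}
```

Because <content> refuses to consume a character at any position where <trailing> could match, the trailing whitespace stays out of <content> and is available as its own sibling capture.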