Alternative version of a grammar does not work the way I want

Ste*_*ieD 5 grammar raku

This code parses $string the way I want:

#! /usr/bin/env raku

my $string = q:to/END/;
aaa bbb   # this has trailing spaces which I want to keep

       kjkjsdf
kjkdsf
END

grammar Markdown {
    token TOP {  ^ ([ <blank> | <text> ])+ $ }
    token blank { [ \h* <.newline> ]  }
    token text { <indent> <content> }
    token indent { \h* }
    token newline { \n }
    token content { \N*? <trailing>* <.newline> }
    token trailing { \h+ }
}

my $match = Markdown.parse($string);
$match.say;

Output:

｢aaa bbb

       kjkjsdf
kjkdsf
｣
 0 => ｢aaa bbb
｣
  text => ｢aaa bbb
｣
   indent => ｢｣
   content => ｢aaa bbb
｣
    trailing => ｢   ｣
 0 => ｢
｣
  blank => ｢
｣
 0 => ｢       kjkjsdf
｣
  text => ｢       kjkjsdf
｣
   indent => ｢       ｣
   content => ｢kjkjsdf
｣
 0 => ｢kjkdsf
｣
  text => ｢kjkdsf
｣
   indent => ｢｣
   content => ｢kjkdsf
｣

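Now, the only problem I have is that I would like the <trailing> capture to sit at the same level of the hierarchy as the <indent> and <content> captures, rather than nested inside <content>.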

So I tried this grammar:

grammar Markdown {
    token TOP {  ^ ([ <blank> | <text> ])+ $ }
    token blank { [ \h* <.newline> ]  }
    token text { <indent> <content> <trailing>* <.newline> }
    token indent { \h* }
    token newline { \n }
    token content { \N*?  }
    token trailing { \h+ }
}

However, that broke the parsing. So I tried this:

    token TOP {  ^ ([ <blank> | <text> ])+ $ }
    token blank { [ \h* <.newline> ]  }
    token text { <indent> <content>*? <trailing>* <.newline> }
    token indent { \h* }
    token newline { \n }
    token content { \N  }
    token trailing { \h+ }

And got:

 0 => ｢aaa bbb
｣
  text => ｢aaa bbb
｣
   indent => ｢｣
   content => ｢a｣
   content => ｢a｣
   content => ｢a｣
   content => ｢ ｣
   content => ｢b｣
   content => ｢b｣
   content => ｢b｣
   trailing => ｢   ｣
 0 => ｢
｣
  blank => ｢
｣
 0 => ｢       kjkjsdf
｣
  text => ｢       kjkjsdf
｣
   indent => ｢       ｣
   content => ｢k｣
   content => ｢j｣
   content => ｢k｣
   content => ｢j｣
   content => ｢s｣
   content => ｢d｣
   content => ｢f｣
 0 => ｢kjkdsf
｣
  text => ｢kjkdsf
｣
   indent => ｢｣
   content => ｢k｣
   content => ｢j｣
   content => ｢k｣
   content => ｢d｣
   content => ｢s｣
   content => ｢f｣

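This is pretty close to what I want, but it has the undesirable side effect of breaking <content> up into single characters. I could easily fix that after the fact by massaging the $match object, but I would like to improve my grammar skills instead.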

wam*_*mba 6

Quick and dirty

my $string = q:to/END/;
aaa bbb  

       kjkjsdf
kjkdsf
END

grammar Markdown {
    token TOP {  ^ ([ <blank> | <text> ])+ $ }
    token blank { [ \h* <.newline> ]  }
    token text { <indent>? $<content>=\N*? <trailing>? <.newline> }
    token indent { \h+ }
    token newline { \n }
    token trailing { \h+ }
}

my $match = Markdown.parse($string);
$match.say;
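A minimal sketch of why inlining the frugal match helps, going by the behavior reported in the question: when \N*? lives in its own token it settles for the empty string, and the calling token cannot backtrack into it to ask for more, whereas the same frugal atom written inline can keep growing until the rest of the rule matches. The grammar names Separate and Inline are hypothetical and exist only for this comparison.

```raku
# Hypothetical comparison grammars; not part of the original answer.
grammar Separate {
    token TOP      { <content> <trailing>? \n }
    token content  { \N*? }    # own token: takes as little as possible, then is final
    token trailing { \h+ }
}

grammar Inline {
    token TOP      { $<content>=\N*? <trailing>? \n }   # frugal atom lives inside TOP
    token trailing { \h+ }
}

my $line = "aaa bbb  \n";

# .parse returns Nil on failure, so .defined tells the two cases apart.
say Separate.parse($line).defined ?? 'Separate: parsed' !! 'Separate: failed';
say Inline.parse($line).defined   ?? 'Inline: parsed'   !! 'Inline: failed';
```

If the question's observations hold, the first grammar should fail and the second should parse, with the trailing spaces landing in <trailing> rather than in content.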

Lookahead assertion

my $string = q:to/END/;
aaa bbb  

       kjkjsdf
kjkdsf
END

grammar Markdown {
    token TOP {  ^ ([ <blank> | <text> ])+ $ }
    token blank { [ \h* <.newline> ]  }
    token text { <indent>? <content> <trailing>? <.newline> }
    token indent { \h+ }
    token newline { \n }
    token content { [<!before <trailing>> \N]+  }
    token trailing { \h+ $$ }
}

my $match = Markdown.parse($string);
$match.say;
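To check that <trailing> now sits beside <indent> and <content>, the match tree can be walked line by line. This is only a sketch: it repeats the lookahead grammar so that it runs on its own, builds the same input as a double-quoted string, and the loop with its labels is illustrative rather than part of the answer.

```raku
grammar Markdown {
    token TOP      { ^ ([ <blank> | <text> ])+ $ }
    token blank    { [ \h* <.newline> ] }
    token text     { <indent>? <content> <trailing>? <.newline> }
    token indent   { \h+ }
    token newline  { \n }
    token content  { [<!before <trailing>> \N]+ }
    token trailing { \h+ $$ }
}

# Same four lines as the heredoc, written with explicit escapes.
my $string = "aaa bbb  \n\n       kjkjsdf\nkjkdsf\n";
my $match  = Markdown.parse($string);

# $match[0] holds one sub-match per line (the quantified positional group).
for $match[0].list -> $line {
    with $line<text> -> $t {
        # <indent> and <trailing> are optional, so fall back to an empty string.
        my $indent   = ($t<indent>   // '').Str;
        my $trailing = ($t<trailing> // '').Str;
        say "indent={$indent.chars} content=｢{$t<content>}｣ trailing={$trailing.chars}";
    }
    else {
        say "blank line";
    }
}
```

Because the three captures are siblings under <text>, each one is reachable directly as $t<indent>, $t<content> and $t<trailing>, with no need to dig into <content> first.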

A bit of refactoring

my $string = q:to/END/;
aaa bbb  

       kjkjsdf
kjkdsf
END

grammar Markdown {
    token TOP { ( <blank> | <text> )+ %% \n }
    token blank { ^^ \h* $$  }
    token text { <indent>? <content> <trailing>? }
    token indent { ^^ \h+ }
    token content { [<!before <trailing>> \N]+  }
    token trailing { \h+ $$ }
}

my $match = Markdown.parse($string);
$match.say;