Stopping a Raku grammar at EOS (end of string)

Question

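While translating one music language into another (ABC to Alda), as an excuse to learn Raku's DSL abilities, I noticed that there doesn't seem to be a way to terminate a .parse! Here is my shortened demo code: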

#!/home/hsmyers/rakudo741/bin/perl6
use v6d;

# use Grammar::Debugger;
use Grammar::Tracer;

my $test-n01 = q:to/EOS/;
a b c d e f g
A B C D E F G
EOS

grammar test {
  token TOP { <score>+ }
  token score {
      <.ws>?
      [
          | <uc>
          | <lc>
      ]+
      <.ws>?
  }
  token uc { <[A..G]> }
  token lc { <[a..g]> }
}

test.parse($test-n01).say;

The last part of the Grammar::Tracer output illustrates my problem.

|  score
|  |  uc
|  |  * MATCH "G"
|  * MATCH "G\n"
|  score
|  * FAIL
* MATCH "a b c d e f g\nA B C D E F G\n"
「a b c d e f g
A B C D E F G
」

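On the second-to-last line of the trace, the word FAIL tells me that the .parse run had no way of quitting. Is that correct? The .say displays everything as it should, so I'm not clear on how real the FAIL is. The question remains: "How do I correctly write a grammar that parses multiple lines without error?"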

Answer 1

use*_*601 10

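When you use the grammar debugger, it lets you see exactly how the engine is parsing the string. Fails are normal and expected. Consider, for example, matching a+b* against the string aab. You should get two matches for 'a', followed by a fail (because b is not a), but then the engine will retry with b and match successfully.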
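
A minimal sketch of that example, with capture groups added only so the two parts can be inspected (the variable name $m is illustrative):

```raku
# a+ greedily takes both "a" characters; the attempt at a third "a"
# fails on "b", which simply ends the a+ loop. b* then matches the "b".
my $m = 'aab' ~~ / (a+) (b*) /;

say ~$m;       # aab
say ~$m[0];    # aa
say ~$m[1];    # b
```

The internal fail never surfaces in the result; it only marks the point where one quantifier stops and the next part of the pattern takes over.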

This may be easier to see if you write an alternation with || (which enforces order). If you have

# rule rather than token, so the spaces in the pattern match whitespace
rule  TOP   { I have a <fruit> }
token fruit { apple || orange || kiwi }

and you parse the sentence "I have a kiwi", you'll see it first match "I have a", followed by two fails for "apple" and "orange", and finally a match for "kiwi".
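
To try this yourself, here is a small self-contained sketch. The grammar name Fruit is made up for the example, and TOP is declared as a rule on the assumption that the spaces in "I have a" are meant to match whitespace in the input:

```raku
grammar Fruit {
    # A rule treats the whitespace in the pattern as <.ws>.
    rule  TOP   { I have a <fruit> }
    # || tries the alternatives strictly left to right:
    # apple fails, orange fails, kiwi matches.
    token fruit { apple || orange || kiwi }
}

my $m = Fruit.parse('I have a kiwi');
say ~$m<fruit>;                            # kiwi
say Fruit.parse('I have a pear').so;       # False
```

Adding `use Grammar::Tracer;` at the top (if that module is installed) should show the two failed alternatives before the successful one.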

Now let's look at your case:

TOP                  # Trying to match top (need >=1 match of score)
|  score             #   Trying to match score (need >=1 match of lc/uc)
|  |  lc             #     Trying to match lc
|  |  * MATCH "a"    #     lc had a successful match! ("a")
|  * MATCH "a "      #   and as a result so did score! ("a ")
|  score             #   Trying to match score again (because <score>+)
|  |  lc             #     Trying to match lc 
|  |  * MATCH "b"    #     lc had a successful match! ("b")
|  * MATCH "b "      #   and as a result so did score! ("b ")
……………                #     …so forth and so on until…
|  score             #   Trying to match score again (because <score>+)
|  |  uc             #     Trying to match uc
|  |  * MATCH "G"    #     uc had a successful match! ("G")
|  * MATCH "G\n"     #   and as a result, so did score! ("G\n")
|  score             #   Trying to match *score* again (because <score>+)
|  * FAIL            #   failed to match score, because no lc/uc.
|
|  # <--------------   At this point, the question is, did TOP match?
|  #                     Remember, TOP is <score>+, so we match TOP if there 
|  #                     was at least one <score> token that matched, there was so...
|
* MATCH "a b c d e f g\nA B C D E F G\n" # this is the TOP match

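The fail here is normal: at some point we run out of <score> tokens, so a fail is inevitable. When that happens, the grammar engine moves on to whatever comes after <score>+ in your grammar. Since there is nothing, that fail actually results in a match of the entire string by TOP (because .parse matches with an implicit /^…$/).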
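
The difference between a harmless internal fail and a real parse failure can be checked directly. The sketch below reuses the rules from the question under the hypothetical name Notes and compares .parse, which has to consume the whole string, with .subparse, which only anchors at the start:

```raku
grammar Notes {
    token TOP   { <score>+ }
    token score { <.ws>? [ <uc> | <lc> ]+ <.ws>? }
    token uc    { <[A..G]> }
    token lc    { <[a..g]> }
}

# Every character is consumed, so the last failed <score> attempt is harmless.
say Notes.parse("a b c\nA B C\n").defined;    # True

# "1" can never be part of a score, and .parse must reach the end of
# the string, so the parse as a whole fails.
say Notes.parse("a b 1").defined;             # False

# .subparse stops where the grammar stops and returns what matched so far.
say Notes.subparse("a b 1").Str.raku;         # "a b "
```

So a FAIL line in the trace matters only if the final result of .parse is undefined.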

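Also, you might consider rewriting the grammar using a rule, which implicitly inserts <.ws> wherever the pattern has significant whitespace (unless it matters that there be exactly one space):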

grammar test {
  rule TOP { <score> + }   # the space before + puts <.ws> after every score
  token score {
      [
          | <uc>
          | <lc>
      ]+
  }
  token uc { <[A..G]> }
  token lc { <[a..g]> }
}
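
For comparison, here is a sketch of the same idea with the whitespace handling written out by hand, which makes it explicit where the built-in <.ws> is applied. The grammar name Spaced is hypothetical:

```raku
grammar Spaced {
    # Explicit form: optional whitespace is consumed after every score.
    token TOP   { [ <score> <.ws> ]+ }
    token score { [ <uc> | <lc> ]+ }
    token uc    { <[A..G]> }
    token lc    { <[a..g]> }
}

my $m = Spaced.parse("a b c d e f g\nA B C D E F G\n");
say $m<score>.elems;    # 14
```

Each letter is separated by whitespace in this input, so every letter ends up as its own score.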

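Also, IME, you may want to add a proto token for uc/lc, because when you have [ <foo> | <bar> ] one of the two captures will always be undefined, which makes handling them in an actions class a bit annoying. You could try: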

grammar test {
  rule  TOP   { <score>  + }
  token score { <letter> + }

  proto token letter    {     *    }
        token letter:sym<uc> { <[A..G]> }
        token letter:sym<lc> { <[a..g]> }
}

That way, $<letter> will always be defined.
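
As a rough illustration of why this helps, here is a sketch of an actions class for such a grammar. The names Letters and LetterActions, the "upper:"/"lower:" labels, and the explicit <.ws> in TOP are all assumptions made for the example; the candidates use the documented :sym<...> naming:

```raku
grammar Letters {
    token TOP   { [ <score> <.ws> ]+ }
    token score { <letter>+ }

    proto token letter {*}
    token letter:sym<uc> { <[A..G]> }
    token letter:sym<lc> { <[a..g]> }
}

class LetterActions {
    # Each proto candidate gets its own action method, so there is
    # no need to test which of $<uc> or $<lc> happens to be defined.
    method letter:sym<uc>($/) { make 'upper:' ~ $/ }
    method letter:sym<lc>($/) { make 'lower:' ~ $/ }

    # $<letter> is always defined here, whichever candidate matched.
    method score($/) { make $<letter>».made.join(',') }
    method TOP($/)   { make $<score>».made.join(' ') }
}

my $m = Letters.parse("ab C\n", actions => LetterActions.new);
say $m.made;    # lower:a,lower:b upper:C
```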
