"Inconsistent" matching results when using a code block in a regex [Raku]

Question

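While examining and testing various aspects of regexes, I stumbled on some strange and "inconsistent" behavior. I was trying to use some code inside a regex, but the same behavior also occurs with an empty code block. What struck me most was the difference in the match results when I swapped the :g modifier for the :x modifier.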

The following snippets demonstrate the "inconsistent" behavior.

First, without a code block:

use v6.d;

if "test1 test2 test3 test4" ~~ m:g/ (\w+) / {
    say ~$_ for $/.list;
}

Result:

test1
test2
test3
test4

Then with the :g modifier and a code block:

use v6.d;

if "test1 test2 test3 test4" ~~ m:g/ (\w+) {} / {
    say ~$_ for $/.list;
}

Result:

test4

Finally, with the :x modifier and a code block:

use v6.d;

if "test1 test2 test3 test4" ~~ m:x(4)/ (\w+) {} / {
    say ~$_ for $/.list;
}

Result:

test1
test2
test3
test4

I expected all three results to be the same, so I was very surprised.

Is there an explanation for this behavior?

Answer 1

rai*_*iph 5

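TL;DR The issue was filed by @jakar and fixed by jnthn.

(Rewritten after more testing and code spelunking.)

This looks like a bug to me (and probably to you too). $/ somehow gets kiboshed when :g is used together with an embedded block.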

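This answer covers:

Homing in on the problem
Looking at the compiler's source code
Searching issue queues and/or filing a new issue

Homing in on the problem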

my &debug = {;} # start off doing no debugging
$_ = 'aa';

say       m      / {debug 1} 'a' {debug 2} /; debug 3; # ｢a｣
say $/ if m      / {debug 1} 'a' {debug 2} /; debug 3; # ｢a｣

say       m:x(2) / {debug 1} 'a' {debug 2} /; debug 3; # (｢a｣ ｢a｣)
say $/ if m:x(2) / {debug 1} 'a' {debug 2} /; debug 3; # (｢a｣ ｢a｣)

say       m:g    / {debug 1} 'a' {debug 2} /; debug 3; # (｢a｣ ｢a｣)
say $/ if m:g    / {debug 1} 'a' {debug 2} /; debug 3; # ｢a｣ <--- Uhoh

Now make debug say something useful and run the first pair (no regex adverbs):

&debug = { say $_, $/.WHICH } # Say location of object bound to `$/`

say       m      / {debug 1} 'a' {debug 2} /; debug 3; # ｢a｣
# 1Match|66118928
# 2Match|66118928
# ｢a｣
# 3Match|66118928

say $/ if m      / {debug 1} 'a' {debug 2} /; debug 3; # ｢a｣
# 1Match|66119072
# 2Match|66119072
# ｢a｣
# 3Match|66119072

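The results are the same in both cases. The matching process creates one Match object and sticks with that same object.

Now the two variants with the :x(2) adverb: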

say       m:x(2) / {debug 1} 'a' {debug 2} /; debug 3; # (｢a｣ ｢a｣)
# 1Match|66119936
# 2Match|66119936
# 1Match|66120080
# 2Match|66120080
# 1Match|66120224
# (｢a｣ ｢a｣)
# 3List|67612624

say $/ if m:x(2) / {debug 1} 'a' {debug 2} /; debug 3; # (｢a｣ ｢a｣)
# 1Match|66120368
# 2Match|66120368
# 1Match|66120512
# 2Match|66120512
# 1Match|66120656
# (｢a｣ ｢a｣)
# 3List|67612672

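This time the matching process creates a Match object and sticks with it for one pass, then uses a second Match object for a second pass, and finally a third Match object for a third pass, in which it fails to match a third 'a' (so the corresponding debug 2 doesn't get called). At the end of the m.../.../ call it creates a List object and binds that to $/.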
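
To check for yourself what kind of object ends up bound to $/ after each variant, something like the following minimal sketch may help. It is only an illustration: the debug calls are replaced by an empty block, the @types array is a name made up for this example, and the type recorded for the :g line may well differ between Rakudo versions, since that is the behavior under discussion.

```raku
$_ = 'aa';
my @types;   # hypothetical helper: records the type bound to $/ after each match

say m      / 'a' {} /;  @types.push: $/.^name;   # plain match
say m:x(2) / 'a' {} /;  @types.push: $/.^name;   # exactly two matches requested
say m:g    / 'a' {} /;  @types.push: $/.^name;   # global match

say @types;
```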

Next we run the first of the two :g cases:

say       m:g    / {debug 1} 'a' {debug 2} /; debug 3; # (｢a｣ ｢a｣)
# 1Match|66119216
# 2Match|66119216
# 1Match|66119360
# 2Match|66119360
# 1Match|66119504
# (｢a｣ ｢a｣)
# 3Match|66119504

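As in the :x(2) case, we try a third time and fail. But this time $/ does not end up bound to a List; it is bound to a Match object, namely the one created in the third pass. (This surprises me.)

Finally, there's the "Uhoh" case: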

say $/ if m:g    / {debug 1} 'a' {debug 2} /; debug 3; # ｢a｣ <--- Uhoh
# 1Match|66119648
# 2Match|66119648
# 1Match|66119792
# 2Match|66119792
# ｢a｣
# 3Match|66119792

Notably, the expected third pass never seems to start.
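
Since the `say m:g ...` runs above print both matches even when $/ looks wrong afterwards, one possible way to sidestep the problem is to work with the returned list rather than with $/. The sketch below assumes the .match method with its :g adverb is an acceptable substitute for m:g in your code; whether it is affected in the same way on a given Rakudo version is not something established here.

```raku
my $text = "test1 test2 test3 test4";

# Keep the returned matches instead of reading them back out of $/.
my @matches = $text.match(/ (\w+) {} /, :g);

say ~$_ for @matches;
```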

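Looking at the compiler's source code

It makes sense that exploring the relevant source code would be valuable. I'm writing it up here in case you or other readers are interested, and in case this is a bug and what I write is of interest to whoever fixes it.

Afaict a code block in a regex causes an AST node to be generated here, with a child node inserted ahead of the statements in the block that performs a bind operation: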

                    :op('bind'),

                    QAST::Var.new( :name('$/'), :scope('lexical') ),

                    QAST::Op.new(
                        QAST::Var.new( :name('$¢'), :scope('lexical') ),
                        :name('MATCH'),
                        :op('callmethod')
                    )

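My understanding of the above is that, immediately before the code in the block runs, it inserts code that binds the lexical $/ symbol to the result of calling the .MATCH method on the object bound to the lexical $¢ symbol.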
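
The visible effect of that binding is that code inside a regex block can already see the match so far through $/ (and so through $0). A minimal sketch, with $seen being a variable invented for this illustration:

```raku
my $seen;

# By the time the block runs, $/ reflects what has matched so far,
# so $0 already holds the first capture.
"abc" ~~ / (a) { $seen = ~$0 } bc /;

say $seen;
```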

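The documentation has a section on $¢; I'll quote one sentence:

The main difference between $/ and $¢ is scope: the latter only has a value inside [a] regex

I wonder why $¢ exists and what other differences there are.

Moving on...

I see there's a Raku-level .MATCH. But it does hardly anything. So I presume the relevant code is here.

At this point I'll pause. I may continue further in a later edit.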

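Searching issue queues and/or filing a new issue

If someone comes up with an answer in the next few days showing that what you've demonstrated is not a bug, or that it has already been filed as a bug, then fair enough.

Otherwise, please consider searching the issue queues yourself and/or opening a new issue in whichever issue queue you think is most appropriate (default to /rakudo/rakudo/issues).

As part of writing this answer I've already searched four github.com issue queues that I thought might be relevant.

I searched for two keywords that I hoped would surface an existing issue ("global" and "publish"). None of the matching issues were relevant. Perhaps you could also look for other keywords that you think someone filing an issue might have used.

If you do file an issue, please consider adding your tests, or mine, or some variation of them, converted into standard roast test cases, if you know how to do that.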

Archived: 5 years, 8 months ago
Views: 124
Last recorded: 5 years, 5 months ago