Why does this regular expression behave differently in sed than in Perl/Ruby?

Question

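I have a regular expression that gives me one result in sed but a different result in Perl (and Ruby).

I have the string one;two;;three, and I want to highlight the substrings delimited by ;. So in Perl I do the following: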

$a = "one;two;;three";
$a =~ s/([^;]*)/[\1]/g;
print $a;

(Or, in Ruby: print "one;two;;three".gsub(/([^;]*)/, "[\\1]").)

The result is:

[one][];[two][];[];[three][]

(I understand where the spurious empty substrings come from.)

Oddly, when I run the same regular expression in sed, I get a different result. I run:

echo "one;two;;three" | sed -e 's/[^;]*/[\0]/g'

I get:

[one];[two];[];[three]

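What causes this difference in results?

Edit:

Someone answered "because sed is not Perl". I know that. I asked because I don't understand how sed handles zero-length matches.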

Answer 1

tih*_*hom 2

From the source code of the substitution function in sed-4.2:

/sed/execute.c
  /* If we're counting up to the Nth match, are we there yet?
     And even if we are there, there is another case we have to
     skip: are we matching an empty string immediately following
     another match?

     This latter case avoids that baaaac, when passed through
     s,a*,x,g, gives `xbxxcx' instead of xbxcx.  This behavior is
     unacceptable because it is not consistently applied (for
     example, `baaaa' gives `xbx', not `xbxx'). */

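This indicates that the behavior we see in Ruby and Perl was deliberately avoided in sed. The difference does not come from any fundamental difference between the languages; it is the result of special handling in sed.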
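To make the rule in that comment concrete, here is a minimal Ruby sketch that emulates it. The helper name sed_like_gsub is hypothetical and this is not sed's actual implementation; it only assumes the rule as quoted: an empty match that starts exactly where the previous match ended is skipped.

```ruby
# Hypothetical helper (not part of Ruby or sed): global substitution that
# skips an empty match immediately following a previous match, as described
# in the sed-4.2 comment quoted above.
def sed_like_gsub(str, regex)
  out = +""
  pos = 0
  prev_end = nil # end offset of the last accepted match

  while pos <= str.length
    m = regex.match(str, pos)
    break unless m
    start, finish = m.begin(0), m.end(0)

    if start == finish && start == prev_end
      # Empty match right after the previous match: skip it,
      # copy one character through unchanged, and move on.
      out << str[start] if start < str.length
      pos = start + 1
      next
    end

    # Copy any unmatched text, then the replacement produced by the block.
    out << str[pos...start] << yield(m[0])
    prev_end = finish

    if start == finish
      # Accepted empty match: step over one character to avoid looping forever.
      out << str[start] if start < str.length
      pos = start + 1
    else
      pos = finish
    end
  end

  out << (str[pos..-1] || "")
end

input = "one;two;;three"
puts input.gsub(/[^;]*/) { |s| "[#{s}]" }         # Ruby's own behavior
puts sed_like_gsub(input, /[^;]*/) { |s| "[#{s}]" } # emulated sed rule
```

Under these assumptions, the first line printed is the Perl/Ruby result from the question and the second is the sed result. The same helper turns baaaac into xbxcx for a*, which is the case the sed comment describes.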
