std*_*nsw 5 regex perl grep negative-lookbehind
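(Note: this is not a duplicate of "Why can't you use repetition quantifiers in zero-width look behind assertions"; see the end of the post.)
I'm trying to write a grep -P (Perl) regex that matches B when it is not preceded by A, regardless of whether there is intervening whitespace.
So I tried this negative lookbehind, and tested it on regex101.com: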
(?<!A)\s*B
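This causes "AB" not to match, which is good, but "A B" does produce a match, which is not what I want.
I'm not sure why this is. It has something to do with the fact that \s* matches the empty string "", so you could say there are infinitely many \s* matches between A and B. But why does this affect "A B" but not "AB"?
Is the following regex a proper solution, and if so, why exactly does it fix the problem?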
(?<![A\s])\s*B
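I posted this before, and it was incorrectly marked as a duplicate. The variable-length part I'm looking for belongs to the match, not to the negative lookbehind itself, so this is quite different from the other question. Yes, I could put the \s* inside the negative lookbehind, but I haven't done that (and it isn't supported, as the other question explains). Also, I'm specifically interested in why the alternative regex I posted above works, because I know it works but I'm not sure why. The other question didn't help answer that.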
But why does this affect "A B" but not "AB"?
A regex matches at a position, which it is helpful to think of as lying between characters. In "A B" there is a position (after the space and before the B) where (?<!A) succeeds (because the character immediately preceding is a space, not an A) and \s*B succeeds (\s* matches the empty string, and B matches B), so the entire pattern succeeds.
In "AB" there is no such position. The only place where \s*B can match (immediately before the B) is also immediately after the A, so (?<!A) cannot succeed there. No position satisfies both, so the pattern as a whole can't succeed.
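The position-by-position reasoning above can be checked with a small sketch. It uses Python's `re` module as a stand-in for `grep -P` (an assumption; for this particular pattern the two engines should behave the same way), and the helper name `first_span` is made up for illustration.

```python
import re

# The original pattern: lookbehind checks only the single character
# immediately before the position where the match starts.
ORIGINAL = re.compile(r"(?<!A)\s*B")

def first_span(text):
    """Return the (start, end) offsets of the first match, or None."""
    m = ORIGINAL.search(text)
    return m.span() if m else None

print(first_span("AB"))   # None: the only candidate position follows the A
print(first_span("A B"))  # (2, 3): match starts after the space, \s* is empty
print(first_span(" B"))   # (0, 2): here \s* consumes the leading space
```

Note that in "A B" the match starts at offset 2, after the space: the engine never needs the space to be part of the match, which is exactly why the lookbehind sees a space rather than the A.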
Is the following regex a proper solution, and if so, why exactly does it fix the problem?
(?<![A\s])\s*B
This works because (?<![A\s]) will not succeed immediately after an A or after a space. So now the lookbehind forbids any match position that has spaces before it. If there are any spaces before the B, they have to be consumed by the \s* portion of the pattern, and the match position must be before them. If that position also doesn't have an A before it, the lookbehind can succeed and the pattern as a whole can match.
This is a trick that's made possible by the fact that \s is a fixed-width pattern that matches at every position inside of a non-empty \s* match. It can't be extended to the general case of any pattern between the (non-)A and the B.
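As a rough check of the revised pattern, here is a sketch again using Python's `re` in place of `grep -P` (an assumption), with sample strings and the helper name `fixed_span` chosen only for illustration.

```python
import re

# Revised pattern: the match may not start right after an A *or* right
# after whitespace, so any spaces before B must be consumed by \s*.
FIXED = re.compile(r"(?<![A\s])\s*B")

def fixed_span(text):
    """Return the (start, end) offsets of the first match, or None."""
    m = FIXED.search(text)
    return m.span() if m else None

for sample in ["AB", "A B", "A   B", "B", " B", "C B"]:
    print(repr(sample), fixed_span(sample))
# 'AB' None
# 'A B' None
# 'A   B' None
# 'B' (0, 1)
# ' B' (0, 2)
# 'C B' (1, 3)
```

In "C B" the match starts at offset 1, before the space, so the lookbehind inspects the C rather than the space. In "A B" the corresponding start position sits right after the A, so the lookbehind fails there, and every later position is preceded by whitespace.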