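I'm struggling with a small regular-expression problem.
I want to replace every odd-length run of a specific character with a substring of the same length made of a different character. Every even-length run of that character should stay unchanged.
Simplified example: a string contains the letters a, b and y, and every odd-length run of y should be replaced by z: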
abyyyab -> abzzzab
Another example:
ycyayybybcyyyyycyybyyyyyyy
becomes
zczayybzbczzzzzcyybzzzzzzz
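Matching all odd-length runs with a regular expression is no problem.
Unfortunately, I have no idea how to carry the length of each match over into the replacement string. I know I have to use backreferences/capture groups somehow, but even after reading a lot of documentation and Stack Overflow posts I still don't know how to do it properly.
As for regex engines, I mainly work with Emacs or Vim.
If I have overlooked a simpler general solution that needs no complicated regex (for example, a small fixed set of simple search-and-replace commands), that would help too.
Here's how I would do it in Vim: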
:s/\vy@<!y(yy)*y@!/\=repeat('z', len(submatch(0)))/g
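Explanation:
The regex we're using is \vy@<!y(yy)*y@!. The \v at the beginning turns on the "very magic" option, so we don't have to escape as much. Without it, we would have y\@<!y\(yy\)*y\@!.
The basic idea of this search is that we look for a single 'y' (y) followed by any number of pairs of 'y's ((yy)*), which always adds up to an odd length. Then we add y@<! to guarantee that there is no 'y' right before the match, and y@! to guarantee that there is no 'y' right after it.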
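The question asks about regex engines in general, so as a cross-check outside Vim, here is a minimal sketch of the same pattern in Python's re module. Python is an assumption of this note (the answer itself only covers Vim), and the names ODD_Y_RUN and odd_runs are made up for illustration.

```python
import re

# Python spelling of the Vim pattern \vy@<!y(yy)*y@!:
#   (?<!y)    no 'y' immediately before the match   (Vim: y@<!)
#   y(?:yy)*  one 'y' plus any number of 'yy' pairs (odd length)
#   (?!y)     no 'y' immediately after the match    (Vim: y@!)
ODD_Y_RUN = re.compile(r"(?<!y)y(?:yy)*(?!y)")

def odd_runs(text):
    """Return (start, length) for every odd-length run of 'y' in text."""
    return [(m.start(), len(m.group(0))) for m in ODD_Y_RUN.finditer(text)]

print(odd_runs("abyyyab"))  # [(2, 3)]
print(odd_runs("ayyb"))     # []
```

The two lookarounds are what keep the pattern from matching an odd-length piece inside a longer even-length run.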
We then replace each match using an expression replacement, \=. From :h sub-replace-\=:
*sub-replace-\=* *s/\=*
When the substitute string starts with "\=" the remainder is interpreted as an
expression.
The special meaning for characters as mentioned at |sub-replace-special| does
not apply except for "<CR>". A <NL> character is used as a line break, you
can get one with a double-quote string: "\n". Prepend a backslash to get a
real <NL> character (which will be a NUL in the file).
The "\=" notation can also be used inside the third argument {sub} of
|substitute()| function. In this case, the special meaning for characters as
mentioned at |sub-replace-special| does not apply at all. Especially, <CR> and
<NL> are interpreted not as a line break but as a carriage-return and a
new-line respectively.
When the result is a |List| then the items are joined with separating line
breaks. Thus each item becomes a line, except that they can contain line
breaks themselves.
The whole matched text can be accessed with "submatch(0)". The text matched
with the first pair of () with "submatch(1)". Likewise for further
sub-matches in ().
TL;DR: :s/foo/\=blah replaces foo with the result of evaluating blah as Vimscript. So the code we're evaluating is repeat('z', len(submatch(0))), which simply produces one 'z' for each 'y' we matched.
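For readers outside Vim, the same "compute the replacement from the match" idea can be sketched in Python, where re.sub accepts a function in place of a replacement string. This is an illustrative assumption rather than part of the Vim answer. The helper name replace_odd_runs and its parameters are hypothetical, and old is assumed to be a single character.

```python
import re

def replace_odd_runs(text, old="y", new="z"):
    """Replace each odd-length run of `old` with a same-length run of `new`.

    Assumes `old` is a single character (hypothetical helper for illustration).
    """
    ch = re.escape(old)
    pattern = rf"(?<!{ch}){ch}(?:{ch}{ch})*(?!{ch})"
    # The callable receives each match object; m.group(0) plays the role of
    # Vim's submatch(0), and string repetition plays the role of repeat().
    return re.sub(pattern, lambda m: new * len(m.group(0)), text)

print(replace_odd_runs("abyyyab"))  # abzzzab
print(replace_odd_runs("ycyayybybcyyyyycyybyyyyyyy"))
```

The second call uses the longer input from the question and should leave the even-length runs of y untouched.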