zx8*_*x81 20
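Introduction
The idea of adding a list of integers to the bottom of the input is similar to a famous database hack (nothing to do with regex) where one joins to a table of integers. My original answer used the @Qtax trick. The current answers use recursion, the Qtax trick (straight or in a reversed variation), or balancing groups.
Yes, it is possible... with some caveats and some regex trickery.
The numbers string :1:2:3:4:5:6:7 sits at the bottom of the input; this is a similar technique to a famous database hack that uses a table of integers.
Suppose we are searching for pig and want to replace it with its line number.
We use this as the input file: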
my cat
dog
my pig
my cow
my mouse
:1:2:3:4:5:6:7
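The numbers row does not have to be typed by hand. As a minimal sketch, assuming the input is held in a Python string, the hypothetical helper append_number_table below (the name is not from the answer) appends one :n entry per input line, which is enough for a match on any line.

```python
def append_number_table(text):
    """Append a ':1:2:...:n' row with one entry per line of text.

    Hypothetical helper for illustration; the name is not from the answer.
    """
    body = text.rstrip("\n")
    line_count = body.count("\n") + 1
    table = "".join(f":{n}" for n in range(1, line_count + 1))
    return f"{body}\n{table}"


sample = "my cat\ndog\nmy pig\nmy cow\nmy mouse"
print(append_number_table(sample))
```

The last line printed is the numbers row for the five sample lines; the example in the answer simply carries a couple of spare entries.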
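Supported languages: apart from the text editors mentioned above (Notepad++ and EditPad Pro), this solution should work in languages that use PCRE (PHP, R, Delphi), in Perl, and in Python using Matthew Barnett's regex module (untested).
The recursive structure lives in the lookahead and is optional. Its job is to balance the lines that do not contain pig, on the left, against the numbers, on the right. Think of it as balancing a nested construct like {{{ }}}, except that on the left we have the non-matching lines and on the right we have the numbers. The point is that when we exit the lookahead, we know how many lines were skipped.
Search: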
(?sm)(?=.*?pig)(?=((?:^(?:(?!pig)[^\r\n])*(?:\r?\n))(?:(?1)|[^:]+)(:\d+))?).*?\Kpig(?=.*?(?(2)\2):(\d+))
Free-spacing version with comments:
(?xsm) # free-spacing mode, multi-line
(?=.*?pig) # fail right away if pig isn't there
(?= # The Recursive Structure Lives In This Lookahead
( # Group 1
(?: # skip one line
^
(?:(?!pig)[^\r\n])* # zero or more chars not followed by pig
(?:\r?\n) # newline chars
)
(?:(?1)|[^:]+) # recurse Group 1 OR match all chars that are not a :
(:\d+) # match digits
)? # End Group
) # End lookahead.
.*?\Kpig # get to pig
(?=.*?(?(2)\2):(\d+)) # Lookahead: capture the next digits
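Replace: \3
In the demo, see the substitutions at the bottom. You can play with the letters on the first two lines (delete a space to make pig) to move the first occurrence of pig to a different line and see how that affects the result.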
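The balancing idea behind the recursive pattern can be sketched outside of regex. The Python below is only an illustration under stated assumptions: the numbers row is the last line of the input, and the names count_skipped and line_number_by_recursion are hypothetical. Each recursion level stands for one skipped line, and each level is paired with one entry of the numbers row, so the entry after the paired ones is the line number.

```python
def count_skipped(lines, target):
    """Recursive stand-in for Group 1: one level per line without target."""
    if not lines or target in lines[0]:
        return 0
    return 1 + count_skipped(lines[1:], target)


def line_number_by_recursion(text, target="pig"):
    """Return the entry of the numbers row that follows the balanced ones."""
    # Assumption: the numbers row is the last line of the input.
    body, _, table = text.rpartition("\n")
    if target not in body:
        return None  # mirrors the (?=.*?pig) guard
    numbers = table.strip(":").split(":")
    depth = count_skipped(body.split("\n"), target)
    # Each recursion level is paired with one number; the next one plays
    # the role of Group 3 in the pattern.
    return numbers[depth] if depth < len(numbers) else None


sample = "my cat\ndog\nmy pig\nmy cow\nmy mouse\n:1:2:3:4:5:6:7"
print(line_number_by_recursion(sample))  # prints 3
```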
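Supported languages: apart from the text editors mentioned above (Notepad++ and EditPad Pro), this solution should work in languages that use PCRE (PHP, R, Delphi), in Perl, and in Python using Matthew Barnett's regex module (untested). The solution is easy to adapt to .NET by converting the \K to a lookbehind and the possessive quantifier to an atomic group (see the .NET version a few lines below).
Search: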
(?sm)(?=.*?pig)(?:(?:^(?:(?!pig)[^\r\n])*(?:\r?\n))(?=[^:]+((?(1)\1):\d+)))*+.*?\Kpig(?=[^:]+(?(1)\1):(\d+))
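.NET version: Back to the Future
.NET does not have \K. In its place, we use a "back to the future" lookbehind (a lookbehind that contains a lookahead that skips ahead of the match). We also need to use an atomic group instead of a possessive quantifier.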
(?sm)(?<=(?=.*?pig)(?=(?>(?:^(?:(?!pig)[^\r\n])*(?:\r?\n))(?=[^:]+((?(1)\1):\d+)))*).*)pig(?=[^:]+(?(1)\1):(\d+))
Free-spacing version with comments (Perl / PCRE version):
(?xsm) # free-spacing mode, multi-line
(?=.*?pig) # lookahead: if pig is not there, fail right away to save the effort
(?: # start counter-line-skipper (lines that don't include pig)
(?: # skip one line
^ #
(?:(?!pig)[^\r\n])* # zero or more chars not followed by pig
(?:\r?\n) # newline chars
)
# for each line skipped, let Group 1 match an ever increasing portion of the numbers string at the bottom
(?= # lookahead
[^:]+ # skip all chars that are not colons
( # start Group 1
(?(1)\1) # match Group 1 if set
:\d+ # match a colon and some digits
) # end Group 1
) # end lookahead
)*+ # end counter-line-skipper: zero or more times
.*? # match
\K # drop everything we've matched so far
pig # match pig (this is the match!)
(?=[^:]+(?(1)\1):(\d+)) # capture the next number to Group 2
Replace:
\2
Output:
my cat
dog
my 3
my cow
my mouse
:1:2:3:4:5:6:7
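In the demo, see the substitutions at the bottom. You can play with the letters on the first two lines (delete a space to make pig) to move the first occurrence of pig to a different line and see how that affects the result.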
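What Group 1 does in the pattern above can be traced step by step. The following Python is a rough simulation, not the regex itself; it assumes the numbers row is the last line and starts with a colon, and the function name is hypothetical. For every skipped line, the string standing in for Group 1 grows by one :n entry, and the entry right after it is the line number.

```python
def line_number_by_growing_group(text, target="pig"):
    """Simulate Group 1 swallowing one more ':n' entry per skipped line."""
    # Assumption: the numbers row is the last line, e.g. ':1:2:3:4:5:6:7'.
    body, _, table = text.rpartition("\n")
    if target not in body:
        return None  # mirrors the (?=.*?pig) guard
    entries = [":" + n for n in table.strip(":").split(":")]
    group1 = ""  # plays the role of capture Group 1
    for skipped, line in enumerate(body.split("\n")):
        if target in line:
            break
        if skipped >= len(entries):
            return None  # the numbers row is too short
        group1 += entries[skipped]
    # Final lookahead: skip what Group 1 holds, then read the next number.
    following = table[len(group1):].split(":")
    return following[1] if len(following) > 1 else None


sample = "my cat\ndog\nmy pig\nmy cow\nmy mouse\n:1:2:3:4:5:6:7"
print(line_number_by_growing_group(sample))  # prints 3
```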
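Choice of delimiter for the numbers
In our example, the delimiter : for the numbers string is rather common and could occur elsewhere in the input. We could invent a UNIQUE_DELIMITER and tweak the expression slightly. However, the following optimization is more efficient and lets us keep the : delimiter.
Instead of pasting our numbers in order, it may be to our benefit to use them in reverse order: :7:6:5:4:3:2:1
In our lookaheads, this lets us get down to the bottom of the input with a simple .* and start backtracking from there. Since we know we are at the end of the string, we do not have to worry about the :digits being part of another section of the string. Here is how to do it.
Input: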
my cat pi g
dog p ig
my pig
my cow
my mouse
:7:6:5:4:3:2:1
Search:
(?xsm) # free-spacing mode, multi-line
(?=.*?pig) # lookahead: if pig is not there, fail right away to save the effort
(?: # start counter-line-skipper (lines that don't include pig)
(?: # skip one line that doesn't have pig
^ #
(?:(?!pig)[^\r\n])* # zero or more chars not followed by pig
(?:\r?\n) # newline chars
)
# Group 1 matches increasing portion of the numbers string at the bottom
(?= # lookahead
.* # get to the end of the input
( # start Group 1
:\d+ # match a colon and some digits
(?(1)\1) # match Group 1 if set
) # end Group 1
) # end lookahead
)*+ # end counter-line-skipper: zero or more times
.*? # match
\K # drop match so far
pig # match pig (this is the match!)
(?=.*(\d+)(?(1)\1)) # capture the next number to Group 2
Replace: \2
See the substitutions in the demo.
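With the reversed row, the counting happens from the end of the input. As a hedged illustration (a Python sketch with a hypothetical function name, assuming the reversed numbers row is the last line), Group 1 ends up holding the last entries of the row, one per skipped line, and the wanted number is the entry just before that suffix.

```python
def line_number_from_reversed_row(text, target="pig"):
    """Read the line number from a reversed row such as ':7:6:5:4:3:2:1'."""
    # Assumption: the reversed numbers row is the last line of the input.
    body, _, table = text.rpartition("\n")
    if target not in body:
        return None  # mirrors the (?=.*?pig) guard
    numbers = table.strip(":").split(":")  # ['7', '6', ..., '1']
    skipped = 0
    for line in body.split("\n"):
        if target in line:
            break
        skipped += 1
    # Group 1 would hold the last `skipped` entries; the number we want
    # is the entry immediately before that suffix.
    return numbers[-(skipped + 1)] if skipped < len(numbers) else None


sample = "my cat pi g\ndog p ig\nmy pig\nmy cow\nmy mouse\n:7:6:5:4:3:2:1"
print(line_number_from_reversed_row(sample))  # prints 3
```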
This solution is specific to .NET.
Search:
(?m)(?<=\A(?<c>^(?:(?!pig)[^\r\n])*(?:\r?\n))*.*?)pig(?=[^:]+(?(c)(?<-c>:\d+)*):(\d+))
Free-spacing version with comments:
(?xm) # free-spacing, multi-line
(?<= # lookbehind
\A #
(?<c> # skip one line that doesn't have pig
# The length of Group c Captures will serve as a counter
^ # beginning of line
(?:(?!pig)[^\r\n])* # zero or more chars not followed by pig
(?:\r?\n) # newline chars
) # end skipper
* # repeat skipper
.*? # we're on the pig line: lazily match chars before pig
) # end lookbehind
pig # match pig: this is the match
(?= # lookahead
[^:]+ # get to the digits
(?(c) # if Group c has been set
(?<-c>:\d+) # decrement c while we match a group of digits
* # repeat: this will only repeat as long as the length of Group c captures > 0
) # end if Group c has been set
:(\d+) # Match the next digit group, capture the digits
) # end lookahead
Replace: $1
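The way the capture stack of Group c serves as a counter can be mimicked with an ordinary list. This is a sketch in Python rather than .NET, assuming the numbers row is the last line; the function name is hypothetical. Each skipped line pushes one capture, each number consumed pops one, and the number read once the stack is empty is the line number.

```python
def line_number_by_capture_stack(text, target="pig"):
    """Use a list as a stand-in for the capture stack of Group c."""
    # Assumption: the numbers row is the last line of the input.
    body, _, table = text.rpartition("\n")
    if target not in body:
        return None
    captures = []
    for line in body.split("\n"):
        if target in line:
            break
        captures.append(line)  # (?<c>...) pushes one capture per skipped line
    numbers = iter(table.strip(":").split(":"))
    while captures:
        if next(numbers, None) is None:
            return None  # fewer numbers than skipped lines
        captures.pop()  # (?<-c>:\d+) pops one capture per number consumed
    return next(numbers, None)  # :(\d+) reads the next number


sample = "my cat\ndog\nmy pig\nmy cow\nmy mouse\n:1:2:3:4:5:6:7"
print(line_number_by_capture_stack(sample))  # prints 3
```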
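I don't know of an editor that does this, short of extending an editor that allows arbitrary extensions.
You could easily use perl to accomplish the task, though.
perl -i.bak -pe"s/word/$./eg" file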
Or, if you want to use wildcards:
perl -MFile::DosGlob=glob -i.bak -pe"BEGIN { @ARGV = map glob($_), @ARGV } s/word/$./eg" *.txt
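For readers without Perl at hand, the same substitution can be sketched in Python. This is an illustration, not a drop-in replacement for the one-liners: it works on a string instead of editing files in place, the function name is hypothetical, and it treats word as literal text (via re.escape) where the Perl version treats it as a regex.

```python
import re


def replace_with_line_numbers(text, word="word"):
    """Replace every occurrence of word with the number of its line."""
    out = []
    # keepends=True preserves the original line endings in the result.
    for lineno, line in enumerate(text.splitlines(keepends=True), start=1):
        out.append(re.sub(re.escape(word), str(lineno), line))
    return "".join(out)


sample = "a word\nnothing here\nword and word\n"
print(replace_with_line_numbers(sample), end="")
```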