Nat*_*enn 7 regex recursion perl
I'm trying to match text of the form sp { ...{...}... }, where the curly braces are allowed to nest. This is what I have so far:
my $regex = qr/
( #save $1
sp\s+ #start Soar production
( #save $2
\{ #opening brace
[^{}]* #anything but braces
\} #closing brace
| (?1) #or nested braces
)+ #0 or more
)
/x;
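I can't get it to match the following text: sp { { word } }. Can anyone see what is wrong with my regex?
There are a number of problems. (?1) recurses into all of group 1, which includes sp\s+, so a nested block would need its own sp prefix, and the first alternative only matches a brace pair with no braces inside it. The recursive bit should be: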
(
(?: \{ (?-1) \}
| [^{}]+
)*
)
All together:
my $regex = qr/
sp\s+
\{
(
(?: \{ (?-1) \}
| [^{}]++
)*
)
\}
/x;
print "$1\n" if 'sp { { word } }' =~ /($regex)/;
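To see how the relative recursion behaves, here is a minimal sketch that runs the same pattern against a few inputs. The input strings and the @results array are made up for illustration and are not from the original question. Because (?-1) refers to the nearest preceding capture group rather than a fixed number, the pattern should keep working when it is wrapped in another group, as /($regex)/ does here.

```perl
use strict;
use warnings;

# Same pattern as above: (?-1) recurses into the capture group
# that was opened most recently before it.
my $regex = qr/
    sp\s+
    \{
    (
        (?: \{ (?-1) \}   # a nested, balanced brace group
        |   [^{}]++       # or a run of non-brace characters
        )*
    )
    \}
/x;

# Hypothetical test inputs; the last one is deliberately unbalanced.
my @inputs = (
    'sp { word }',
    'sp { { word } }',
    'sp { a { b { c } } d }',
    'sp { { word }',
);

my @results;
for my $text (@inputs) {
    if ($text =~ /($regex)/) {
        push @results, $1;
        print "matched:  $1\n";
    }
    else {
        push @results, undef;
        print "no match: $text\n";
    }
}
```

The first three inputs match in full; the unbalanced one does not match at all, since the outer closing brace is never found.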
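This is a case for the underused Text::Balanced, a very handy core module for this kind of thing. It does rely on pos first being set to the start of the delimited sequence, so I typically invoke it like this: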
#!/usr/bin/env perl
use strict;
use warnings;
use Text::Balanced 'extract_bracketed';
sub get_bracketed {
my $str = shift;
# seek to beginning of bracket
return undef unless $str =~ /(sp\s+)(?=\{)/gc;
# store the prefix
my $prefix = $1;
# get everything from the start brace to the matching end brace
my ($bracketed) = extract_bracketed( $str, '{}');
# no closing brace found
return undef unless $bracketed;
# return the whole match
return $prefix . $bracketed;
}
my $str = 'sp { { word } }';
print get_bracketed $str;
The regex with the gc modifiers tells the string to remember where the match ended, and extract_bracketed uses that information to know where to start.
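As a rough sketch of how pos ties the two steps together, the loop below pulls every sp block out of a longer string. The input string and the @found array are invented for the example. It assumes the documented list-context behaviour of Text::Balanced, where the input variable's pos is moved past the extracted text.

```perl
use strict;
use warnings;
use Text::Balanced 'extract_bracketed';

# Hypothetical input with two productions separated by other text.
my $str = 'sp { a { b } } junk sp { c }';

my @found;
# In scalar context, /g advances pos($str) to the end of each match;
# /c keeps pos from being reset when a match attempt fails.
while ($str =~ /(sp\s+)(?=\{)/gc) {
    my $prefix = $1;

    # extract_bracketed starts scanning at pos($str), which now
    # points at the opening brace.
    my ($bracketed) = extract_bracketed($str, '{}');

    # Stop if the braces were not balanced.
    last unless defined $bracketed;

    push @found, $prefix . $bracketed;
}

print "$_\n" for @found;
```

Each pass resumes the regex search from wherever pos was left, so text that has already been extracted is not scanned again.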