Mul*_*ave 3 php regex parsing text-parsing regex-lookarounds
Given a dummy function:
public function handle()
{
    if (isset($input['data'])) {
        switch ($data) {
            ...
        }
    } else {
        switch ($data) {
            ...
        }
    }
}
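My aim is to get the contents of that function. The problem is matching the nested pattern of curly braces {...}.
I have come across recursive patterns, but I could not work out a regex that matches the function's body.
I've tried the following (no recursion):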
$pattern = "/function\shandle\([a-zA-Z0-9_\$\s,]+\)?". // match "function handle(...)"
'[\n\s]?[\t\s]*'. // regardless of the indentation preceding the {
'{([^{}]*)}/'; // find everything within braces.
preg_match($pattern, $contents, $match);
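This pattern doesn't match at all. I am sure it is the last bit that is wrong, '{([^{}]*)}/', because the pattern works when there are no other braces inside the body.
Replacing it with: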
'{([^}]*)}/';
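It matches up to the closing } of the switch inside the if statement and stops there (including the } of the switch but excluding that of the if).
This pattern gives the same result: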
'{(\K[^}]*(?=)})/m';
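A minimal sketch of why the non-recursive attempt stops early, using only the brace part of the pattern. The `case` lines are placeholders added for illustration. Since `[^}]*` can never step over a closing brace, the capture ends at the first `}` it meets:

```php
<?php
// Sample input; the switch bodies are placeholder content, not from the original code.
$contents = <<<'PHP'
public function handle()
{
    if (isset($input['data'])) {
        switch ($data) {
            case 1: break;
        }
    } else {
        switch ($data) {
            case 2: break;
        }
    }
}
PHP;

// [^}]* happily consumes opening braces but halts before the first "}".
preg_match('/\{([^}]*)\}/', $contents, $match);
$captured = $match[1];

echo $captured, PHP_EOL;
```

The capture contains the opening braces of the `if` and the first `switch`, but never reaches the `else` branch. A character class has no notion of nesting depth.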
According to others' comments:
^\s*[\w\s]+\(.*\)\s*\K({((?>"(?:[^"\\]*+|\\.)*"|'(?:[^'\\]*+|\\.)*'|//.*$|/\*[\s\S]*?\*/|#.*$|<<<\s*["']?(\w+)["']?[^;]+\3;$|[^{}<'"/#]++|[^{}]++|(?1))*)})
Note: a short RegEx, i.e. {((?>[^{}]++|(?R))*)}, is enough if you know your input does not contain { or } outside of PHP syntax.
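As a rough sketch of the short recursive pattern in use: `extractFirstBlock()` is a hypothetical helper name, and the sample source is made up for illustration. The braces are escaped here for readability only. `(?R)` re-enters the whole pattern each time a nested `{` is met:

```php
<?php
/**
 * Hypothetical helper: returns the contents of the first balanced {...}
 * block in $source, or null if there is none. Assumes braces only occur
 * as PHP syntax (not inside strings or comments).
 */
function extractFirstBlock(string $source): ?string
{
    // [^{}]++ eats non-brace text possessively; (?R) recurses into nested blocks.
    $pattern = '/\{((?>[^{}]++|(?R))*)\}/';

    return preg_match($pattern, $source, $match) === 1 ? $match[1] : null;
}

$source = <<<'PHP'
public function handle()
{
    if (isset($input['data'])) {
        switch ($data) {
            case 1: break;
        }
    } else {
        switch ($data) {
            case 2: break;
        }
    }
}
PHP;

$body = extractFirstBlock($source);
echo trim($body), PHP_EOL;
```

Group 1 at the top level holds the whole function body. Values captured during recursion are discarded when the recursion returns.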
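The long regex is meant for the awkward cases where:
- [{}] appears in a string between quotation marks ["']
- [{}] appears in a comment block: //... or /*...*/ or #...
- [{}] appears in a heredoc or nowdoc: <<<STR or <<<['"]STR['"]
Otherwise, it expects a pair of opening/closing braces, and the depth of the nested braces does not matter.
Does it ever fail? No, unless a Martian lives inside your code.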
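To illustrate the quoted-string case, here is a reduced sketch, not the full pattern from this answer. `$stringAware` is a cut-down variant that adds only the two quoted-string alternatives to the short pattern. The one-line input is invented for the demonstration:

```php
<?php
// Invented input: a closing brace hides inside a double-quoted string.
$source = 'function f() { echo "}"; return 1; }';

// Short pattern: treats every brace as syntax.
$short = '/\{((?>[^{}]++|(?R))*)\}/';

// Reduced string-aware variant (an illustration, not the answer's full regex):
// quoted strings are consumed as a unit before braces are considered.
$stringAware = <<<'REGEX'
~\{((?>"(?:[^"\\]++|\\.)*"|'(?:[^'\\]++|\\.)*'|[^{}"']++|(?R))*)\}~s
REGEX;

preg_match($short, $source, $m);
$shortBody = $m[1];

preg_match($stringAware, $source, $m);
$awareBody = $m[1];

var_dump($shortBody, $awareBody);
```

The short pattern stops at the brace inside the string, while the string-aware variant returns the whole body. The comment and heredoc alternatives in the long regex follow the same idea.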
^ \s* [\w\s]+ \( .* \) \s* \K  # how it matches a function definition
(  # (1 start)
    {  # opening brace
    (  # (2 start)
        (?>  # atomic grouping (for its non-capturing purpose only)
            "(?: [^"\\]*+ | \\ . )*"  # double quoted strings
          | '(?: [^'\\]*+ | \\ . )*'  # single quoted strings
          | // .* $  # a single line comment starting with //
          | /\* [\s\S]*? \*/  # a multi line comment block /*...*/
          | \# .* $  # a single line comment starting with #...
          | <<< \s* ["']?  # heredocs and nowdocs
            ( \w+ )  # (3) ^
            ["']? [^;]+ \3 ; $  # ^
          | [^{}<'"/#]++  # force engine to backtrack if it encounters special characters [<'"/#] (possessive)
          | [^{}]++  # default matching behaviour (possessive)
          | (?1)  # recurse 1st capturing group
        )*  # zero to many times of atomic group
    )  # (2 end)
    }  # closing brace
)  # (1 end)
Formatting was done with @sln's RegexFormatter software.
I gave it Laravel's Eloquent Model.php file (~3500 lines), picked at random, as input. Check it out: Live demo
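As a usage sketch, the one-line pattern can be passed to `preg_match` with the `m` modifier, so that `^` and `$` work per line. The `~` delimiter, the nowdoc wrapper, and the sample source are choices made here for illustration. Group 1 is the block including its braces and group 2 is the body:

```php
<?php
// Invented sample: braces also appear inside a comment and a string.
$contents = <<<'PHP'
public function handle()
{
    if (isset($input['data'])) {
        // a stray } inside a comment
        $open = "{";
    } else {
        return null;
    }
}
PHP;

// The one-line regex from this answer, kept in a nowdoc so no backslash
// needs doubling. "~" is used as delimiter because the pattern contains "/".
$pattern = <<<'REGEX'
~^\s*[\w\s]+\(.*\)\s*\K({((?>"(?:[^"\\]*+|\\.)*"|'(?:[^'\\]*+|\\.)*'|//.*$|/\*[\s\S]*?\*/|#.*$|<<<\s*["']?(\w+)["']?[^;]+\3;$|[^{}<'"/#]++|[^{}]++|(?1))*)})~m
REGEX;

$body = null;
if (preg_match($pattern, $contents, $match) === 1) {
    // \K drops the function header, so $match[0] starts at the opening brace.
    $body = $match[2];
    echo trim($body), PHP_EOL;
}
```

Because of `\K`, the reported match begins at the opening brace rather than at the function signature.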