Complex lookahead / non-greedy regular expression

Question

I have the following string:

(this is all on one line)

<IMG SRC="/include/images/moredetails.png" WIDTH="8" HEIGHT="7" ONMOUSEOVER="return createPopup('<b>[scan_name@user:home]:</b> <!-- #EscapedName# --><br><b>[organization@user:home]:</b><br><!-- #EscapedOrganizationPath# --><br><b>[total@user:home]:</b> <!-- #EscapedTotal# --><br><b>[high@user:home]:</b> <!-- #EscapedHigh# --><br><b>[medium@user:home]:</b> <!-- #EscapedMedium# --><br><b>[low@user:home]:</b> <!-- #EscapedLow# --><br><b>[date_last_scanned@user:home]:</b> <!-- #EscapedDate# -->');" ONMOUSEOUT="return nd(1000);"><!-- #Name# --></TD>

And a second string:

<IMG SRC="/include/images/moredetails.png" WIDTH="8" HEIGHT="7" ONMOUSEOVER="return createPopup('<b>[scan_name@user:home]:</b> <!-- #EscapedName# --><br><b>조직/부서 경로:</b><br><!-- #EscapedOrganizationPath# --><br><b>[total@user:home]:</b> <!-- #EscapedTotal# --><br><b>[high@user:home]:</b> <!-- #EscapedHigh# --><br><b>[medium@user:home]:</b> <!-- #EscapedMedium# --><br><b>[low@user:home]:</b> <!-- #EscapedLow# --><br><b>[date_last_scanned@user:home]:</b> <!-- #EscapedDate# -->');" ONMOUSEOUT="return nd(1000);"><!-- #Name# --></TD>

I want to find all the [..] placeholders in the first string and look up their Korean translations in the second string.

The code I wrote to do this:

while($stringA =~ /(.*?)(\[[^\]]+?\])(.*?)/g) {
 my $prefix = $1;
 my $tag = $2;
 my $suffix = $3;

Then I run a regex built from $prefix and $suffix:

if ($stringB =~ /\Q$prefix\E(.*)\Q$suffix\E/g) {
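
The \Q...\E wrapping matters because the captured prefix and suffix contain regex metacharacters such as ( and [. A minimal sketch of that step, where the shortened strings are made up for illustration and are not the real markup:

```perl
use strict;
use warnings;

# Hypothetical fragments standing in for $prefix and $suffix.
my $prefix = q{createPopup('<b>};
my $suffix = q{:</b>};
my $target = q{return createPopup('<b>[total@user:home]:</b> <!-- #EscapedTotal# -->};

# \Q...\E quotes metacharacters in the interpolated text. Without it,
# the "(" in $prefix would open a group that is never closed and the
# pattern would fail to compile.
my ($middle) = $target =~ /\Q$prefix\E(.*)\Q$suffix\E/;
print "middle=$middle\n";
```

Here $middle ends up holding whatever sits between the two literal fragments.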

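Note: the examples copied below do not escape the " characters; I left them that way to keep things readable.

Problems:

A. $prefix and $suffix do not contain everything before and after the placeholder, because I am using non-greedy quantifiers. Example: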

$prefix = "<IMG SRC="/include/images/moredetails.png" WIDTH="8" HEIGHT="7" ONMOUSEOVER="return createPopup('<b>"
$tag = "[scan_name@user:home]"
$suffix = ""
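
The empty $suffix follows from how lazy quantifiers behave: a (.*?) at the very end of a pattern has nothing after it that would force it to expand, so it settles for zero characters. A small sketch of that effect, assuming a shortened, made-up input rather than the real string:

```perl
use strict;
use warnings;

# Shortened, hypothetical input; the real string is much longer.
my $input = '<b>[scan_name@user:home]:</b> <b>[total@user:home]:</b>';

my @matches;
while ($input =~ /(.*?)(\[[^\]]+?\])(.*?)/g) {
    # The trailing lazy group has nothing after it to satisfy,
    # so it stops at zero characters.
    push @matches, { prefix => $1, tag => $2, suffix => $3 };
}

printf "prefix=%s tag=%s suffix='%s'\n", @{$_}{qw(prefix tag suffix)}
    for @matches;
```

Each tag is found, but every suffix is empty, and each prefix only reaches back to the end of the previous match.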

B. If I use greedy quantifiers instead, (.*)(\[[^\]]+?\])(.*), I capture everything "correctly", but only the last tag is captured. Example:

$prefix = "<IMG SRC="/include/images/moredetails.png" WIDTH="8" HEIGHT="7" ONMOUSEOVER="return createPopup('<b>[scan_name@user:home]:</b> <!-- #EscapedName# --><br><b>[organization@user:home]:</b><br><!-- #EscapedOrganizationPath# --><br><b>[total@user:home]:</b> <!-- #EscapedTotal# --><br><b>[high@user:home]:</b> <!-- #EscapedHigh# --><br><b>[medium@user:home]:</b> <!-- #EscapedMedium# --><br><b>[low@user:home]:</b> <!-- #EscapedLow# --><br><b>"
$tag = "[date_last_scanned@user:home]"
$suffix = ":</b> <!-- #EscapedDate# -->');" ONMOUSEOUT="return nd(1000);"><!-- #Name# --></TD>"
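
The greedy version finds only one tag because the leading (.*) first swallows the whole string and then backtracks just far enough to let the last [..] match, leaving nothing for a second iteration. A sketch of this, again on a shortened, made-up input:

```perl
use strict;
use warnings;

# Shortened, hypothetical input; the real string is much longer.
my $input = '<b>[scan_name@user:home]:</b> <b>[total@user:home]:</b>';

my @tags;
while ($input =~ /(.*)(\[[^\]]+?\])(.*)/g) {
    # The greedy prefix consumes every earlier tag, so only the
    # final one lands in $2, and the match ends at end of string.
    push @tags, $2;
}

print scalar(@tags), " match(es): @tags\n";
```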

What I want

I want to capture all the tags, compare each one against the translated string, and return something like:

'[state@user:home] = '??'

Thanks for your help.

Answer 1

Tot*_*oto 2

How about:

my $strA = q~<IMG SRC="/include/images/moredetails.png" WIDTH="8" HEIGHT="7" ONMOUSEOVER="return createPopup('<b>[scan_name@user:home]:</b> <!-- #EscapedName# --><br><b>[organization@user:home]:</b><br><!-- #EscapedOrganizationPath# --><br><b>[total@user:home]:</b> <!-- #EscapedTotal# --><br><b>[high@user:home]:</b> <!-- #EscapedHigh# --><br><b>[medium@user:home]:</b> <!-- #EscapedMedium# --><br><b>[low@user:home]:</b> <!-- #EscapedLow# --><br><b>[date_last_scanned@user:home]:</b> <!-- #EscapedDate# -->');" ONMOUSEOUT="return nd(1000);"><!-- #Name# --></TD>~;
my $strB = q~<IMG SRC="/include/images/moredetails.png" WIDTH="8" HEIGHT="7" ONMOUSEOVER="return createPopup('<b>[scan_name@user:home]:</b> <!-- #EscapedName# --><br><b>조직/부서 경로:</b><br><!-- #EscapedOrganizationPath# --><br><b>[total@user:home]:</b> <!-- #EscapedTotal# --><br><b>[high@user:home]:</b> <!-- #EscapedHigh# --><br><b>[medium@user:home]:</b> <!-- #EscapedMedium# --><br><b>[low@user:home]:</b> <!-- #EscapedLow# --><br><b>[date_last_scanned@user:home]:</b> <!-- #EscapedDate# -->');" ONMOUSEOUT="return nd(1000);"><!-- #Name# --></TD>~;

while($strA =~ /(.*?)\[([^\]]+?)\](.)/g) {
    my $prefix = $1;
    my $tag = $2;
    my $suffix = $3;
    print "prefix=$prefix\ntag=$tag\nsuffix=$suffix\n";
    print "found it $1\n\n" if ($strB =~ /\Q$prefix\E\[?([^\[\]]+)\]?\Q$suffix\E/g);
}

If you want a longer suffix to avoid overlaps, you can use:

while($strA =~ /(.*?)\[([^\]]+?)\]([^[]*)/g) {
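
To get the tag = translation pairs the question asks for, the matches can be collected instead of printed. The following is only a sketch under assumptions: find_translations is a hypothetical helper name, the two strings are shortened stand-ins for the real markup, and the loop uses the longer-suffix pattern.

```perl
use strict;
use warnings;
use utf8;
binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical helper: pairs each [tag] in $source with the text found
# at the corresponding position in $translated.
sub find_translations {
    my ($source, $translated) = @_;
    my @pairs;
    while ($source =~ /(.*?)\[([^\]]+?)\]([^\[]*)/g) {
        my ($prefix, $tag, $suffix) = ($1, $2, $3);
        # /g in scalar context resumes where the previous match on
        # $translated ended, so both strings are walked in step.
        if ($translated =~ /\Q$prefix\E\[?([^\[\]]+)\]?\Q$suffix\E/g) {
            push @pairs, [ $tag, $1 ];
        }
    }
    return @pairs;
}

# Shortened, made-up stand-ins for the two strings.
my $source     = q{<b>[scan_name@user:home]:</b> <br><b>[organization@user:home]:</b><br>};
my $translated = q{<b>[scan_name@user:home]:</b> <br><b>조직/부서 경로:</b><br>};

my @pairs = find_translations($source, $translated);
printf "[%s] = '%s'\n", @$_ for @pairs;
```

A pair whose second element equals the tag itself corresponds to a placeholder that has not been translated yet.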

Output:

prefix=<IMG SRC="/include/images/moredetails.png" WIDTH="8" HEIGHT="7" ONMOUSEOVER="return createPopup('<b>
tag=scan_name@user:home
suffix=:
found it scan_name@user:home

prefix=</b> <!-- #EscapedName# --><br><b>
tag=organization@user:home
suffix=:
found it 조직/부서 경로

prefix=</b><br><!-- #EscapedOrganizationPath# --><br><b>
tag=total@user:home
suffix=:
found it total@user:home

prefix=</b> <!-- #EscapedTotal# --><br><b>
tag=high@user:home
suffix=:
found it high@user:home

prefix=</b> <!-- #EscapedHigh# --><br><b>
tag=medium@user:home
suffix=:
found it medium@user:home

prefix=</b> <!-- #EscapedMedium# --><br><b>
tag=low@user:home
suffix=:
found it low@user:home

prefix=</b> <!-- #EscapedLow# --><br><b>
tag=date_last_scanned@user:home
suffix=:
found it date_last_scanned@user:home
