为什么.*消耗这个Perl正则表达式中的整个字符串?

cha*_*par 11 regex perl

为什么第一个print语句没有输出我期望的内容:

first = This is a test string, sec = This is a test string 
Run Code Online (Sandbox Code Playgroud)

由于*和+都是贪婪的,为什么内部*即在"("中的第一个匹配中不消耗整个字符串?

use strict;
use warnings;

my $string = "This is a test string";
$string =~ /((.*)*)/; 
print "first = $1, sec = $2\n";  #prints "first = This is a test string, sec ="

$string =~ /((.+)*)/;
print "first = $1, sec = $2\n";  #prints "first = This is a test string, sec = This is a test string"
Run Code Online (Sandbox Code Playgroud)

sep*_*p2k 17

在第一个正则表达式.*匹配两次.第一次匹配整个字符串.第二次匹配末尾的空字符串,因为.*当没有其他内容匹配时匹配空字符串.

其他正则表达式不会发生这种情况,因为.+无法匹配空字符串.

编辑:至于什么地方:$ 2将包含上次.*/ .+应用的匹配内容.$ 1将包含与(.*)*/ 匹配的内容(.+)*,即整个字符串.


Bra*_*ert 14

用" use re 'debug'" 运行它会导致:

Compiling REx "((.*)*)"
Final program:
   1: OPEN1 (3)
   3:   CURLYX[0] {0,32767} (12)
   5:     OPEN2 (7)
   7:       STAR (9) # <====
   8:         REG_ANY (0)
   9:     CLOSE2 (11)
  11:   WHILEM[1/1] (0)
  12:   NOTHING (13)
  13: CLOSE1 (15)
  15: END (0)
minlen 0 
Run Code Online (Sandbox Code Playgroud)

Matching REx "((.*)*)" against "This is a test string"
   0 <> <This is a >         |  1:OPEN1(3)
   0 <> <This is a >         |  3:CURLYX[0] {0,32767}(12)
   0 <> <This is a >         | 11:  WHILEM[1/1](0)
                                    whilem: matched 0 out of 0..32767
   0 <> <This is a >         |  5:    OPEN2(7)
   0 <> <This is a >         |  7:    STAR(9) # <====
                                      REG_ANY can match 21 times out of 2147483647...
  21 < test string> <>       |  9:      CLOSE2(11)
  21 < test string> <>       | 11:      WHILEM[1/1](0)
                                        whilem: matched 1 out of 0..32767
  21 < test string> <>       |  5:        OPEN2(7)
  21 < test string> <>       |  7:        STAR(9) # <====

  # This is where the outputs really start to diverge
  # --------------------------------------------------------------------------------------------
                                          REG_ANY can match 0 times out of 2147483647...
  21 < test string> <>       |  9:          CLOSE2(11) # <==== Succeeded
  21 < test string> <>       | 11:          WHILEM[1/1](0)
                                            whilem: matched 2 out of 0..32767
                                            whilem: empty match detected, trying continuation...
  # --------------------------------------------------------------------------------------------

  21 < test string> <>       | 12:            NOTHING(13)
  21 < test string> <>       | 13:            CLOSE1(15)
  21 < test string> <>       | 15:            END(0)
Match successful!
Run Code Online (Sandbox Code Playgroud)

Compiling REx "((.+)*)"
Final program:
   1: OPEN1 (3)
   3:   CURLYX[0] {0,32767} (12)
   5:     OPEN2 (7)
   7:       PLUS (9) # <====
   8:         REG_ANY (0)
   9:     CLOSE2 (11)
  11:   WHILEM[1/1] (0)
  12:   NOTHING (13)
  13: CLOSE1 (15)
  15: END (0)
minlen 0 
Run Code Online (Sandbox Code Playgroud)

Matching REx "((.+)*)" against "This is a test string"
   0 <> <This is a >         |  1:OPEN1(3)
   0 <> <This is a >         |  3:CURLYX[0] {0,32767}(12)
   0 <> <This is a >         | 11:  WHILEM[1/1](0)
                                    whilem: matched 0 out of 0..32767
   0 <> <This is a >         |  5:    OPEN2(7)
   0 <> <This is a >         |  7:    PLUS(9) # <====
                                      REG_ANY can match 21 times out of 2147483647...
  21 < test string> <>       |  9:      CLOSE2(11)
  21 < test string> <>       | 11:      WHILEM[1/1](0)
                                        whilem: matched 1 out of 0..32767
  21 < test string> <>       |  5:        OPEN2(7)
  21 < test string> <>       |  7:        PLUS(9) # <====

  # This is where the outputs really start to diverge
  # ------------------------------------------------------------------------------------
                                          REG_ANY can match 0 times out of 2147483647...
                                          failed... # <==== Failed
                                        whilem: failed, trying continuation...
  # ------------------------------------------------------------------------------------

  21 < test string> <>       | 12:        NOTHING(13)
  21 < test string> <>       | 13:        CLOSE1(15)
  21 < test string> <>       | 15:        END(0)
Match successful!
Run Code Online (Sandbox Code Playgroud)