如何以未展开形式打印展开的正则表达式?

Mic*_*man 7 regex perl

是否可以以qr/.../x非扩展形式打印使用扩展符号()创建的正则表达式?例如:

my $decimal = qr/
  (?=\d|\.\d)  # look-ahead to ensure at least one of the optional parts matches
  \d*          # optional whole digits
  (?:\.\d*)?   # optional decimal point and fractional digits
/x;

say $decimal;
Run Code Online (Sandbox Code Playgroud)

我希望将其打印为(?=\d|\.\d)\d*(?:\.\d*)?

我可以编写一个解析器来剥离非功能性部分,但这将复制perl已经执行的操作,而且我可能会弄错一些非平凡的情况。

(是的,这似乎有点愚蠢。我有一个用例,需要打印很多消息,例如,matched <pattern>并且我希望将消息限制为一行,同时允许将扩展的符号用于模式。)

ike*_*ami 7

Perl doesn't provide such a utility. It parses regex patterns; it doesn't generate them. The stringification of the object is the exact string provided to the parser, wrapped in a (?:...) that accounts for the flags. The string provided to the parser is the post-interpolation literal minus the delimiters.[1]

That said, this would be trivial to do with a regex parser.

There is YAPE::Regex, but it hasn't been updated in a long time. For example, it doesn't support the (?^:...) found in the stringification of regex in modern version of Perl.

There is also Regexp::Parser. It's newer, but it doesn't support (?^:...) either! But if we were to work around that, it would be be perfect since naturally ignores whitespace and comments! All we need to do is parse the pattern and get a stringifiction from the parse tree.

Finally, there's Regexp::Parsertron. It's the newest, and it does support (?^:...), but it doesn't distinguish whitespace and comments from "exact matches" tokens.

So let's use Regexp::Parser.[2]

#!/usr/bin/perl
use strict;
use warnings;
use feature qw( say );

use Regexp::Parser qw( );

{
   @ARGV == 1
      or die("usage\n");

   my $re = $ARGV[0];

   # R::P doesn't support «(?^:...)», so we'll
   # provide a backwards-compatible stringification.
   $re =~ s{^\(\?\^(\w*):}{
      my %on = map { $_ => 1 } split //, $1;
      my $on  = join "", grep  $on{$_}, qw( i m s x );
      my $off = join "", grep !$on{$_}, qw( i m s x );
      "(?$on-$off:"
   }e;

   my $parser = Regexp::Parser->new($re);
   my $roots = $parser->root
      or die($parser->errmsg);

   say join "", map $_->visual, @$roots;
}
Run Code Online (Sandbox Code Playgroud)

Test:

#!/usr/bin/perl
use strict;
use warnings;
use feature qw( say );

use Regexp::Parser qw( );

{
   @ARGV == 1
      or die("usage\n");

   my $re = $ARGV[0];

   # R::P doesn't support «(?^:...)», so we'll
   # provide a backwards-compatible stringification.
   $re =~ s{^\(\?\^(\w*):}{
      my %on = map { $_ => 1 } split //, $1;
      my $on  = join "", grep  $on{$_}, qw( i m s x );
      my $off = join "", grep !$on{$_}, qw( i m s x );
      "(?$on-$off:"
   }e;

   my $parser = Regexp::Parser->new($re);
   my $roots = $parser->root
      or die($parser->errmsg);

   say join "", map $_->visual, @$roots;
}
Run Code Online (Sandbox Code Playgroud)
  1. \Q\u以及类似的在插补同一阶段完成。\N{...}决心\N{U+...}以不朽当前charnames设置。其它逃逸如\x27\x{0000027}\\\/保留字符的字符。

  2. 在此答案的早期版本中使用了基于YAPE :: Regex的解决方案。