daw*_*awg 26 perl performance idioms operators goatse
The "goatse operator", or the =()= idiom, in Perl causes an expression to be evaluated in list context.
An example is:
my $str = "5 and 4 and a 3 and 2 1 BLAST OFF!!!";
my $count =()= $str =~ /\d/g; # 5 matches...
print "There are $count numbers in your countdown...\n\n";
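As I interpret the use, this is what happens:
$str =~ /\d/g matches all the digits. The g switch and list context produce a list of those matches. Let this be the "List Producer" example; in Perl this could be many things.
The =()= causes an assignment to an empty list, so all the actual matches are copied to an empty list.
After the scalar assignment, the reference count of the empty list =()= goes to zero. Perl then deletes the copy of the list elements.
The question about efficiency is:
This works great with this trivial list, but what if the list were hundreds of thousands of matches? With this method you produce a full copy of every match and then delete it just to count them.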
Cha*_*ens 24
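Perl 5 is smart about copying lists. It only copies as many items as are on the left-hand side. The idiom works because list assignment in scalar context yields the number of items on the right-hand side. So n items will be created by the regex, but they won't be copied and then discarded, just discarded. You can see the difference the extra copy makes in the naive case in the benchmark below.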
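The rule that list assignment in scalar context yields the right-hand-side count can be checked in isolation. The following is a minimal sketch; the variable names are illustrative and not from the original post.

```perl
use strict;
use warnings;

# A list assignment evaluated in scalar context yields the number of
# elements on its right-hand side, no matter how many the left side takes.
my $count_empty = () = (10, 20, 30);             # left side takes nothing
my $count_two   = (my ($x, $y) = (10, 20, 30));  # left side takes two items

print "$count_empty $count_two $x $y\n";   # prints "3 3 10 20"
```

Both counts are 3 even though the left-hand sides accept zero and two items respectively, which is why =()= can count without keeping anything.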
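As for efficiency, an iterative solution is often easier on memory and CPU usage, but this must be weighed against the succinctness of the goatse secret operator. Here are the results of benchmarking the various solutions: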
naive: 10
iterative: 10
goatse: 10

for 0 items:
               Rate iterative    goatse     naive
iterative 4365983/s        --       -7%      -12%
goatse    4711803/s        8%        --       -5%
naive     4962920/s       14%        5%        --

for 1 items:
               Rate     naive    goatse iterative
naive      749594/s        --      -32%      -69%
goatse    1103081/s       47%        --      -55%
iterative 2457599/s      228%      123%        --

for 10 items:
               Rate     naive    goatse iterative
naive       85418/s        --      -33%      -82%
goatse     127999/s       50%        --      -74%
iterative  486652/s      470%      280%        --

for 100 items:
               Rate     naive    goatse iterative
naive        9309/s        --      -31%      -83%
goatse      13524/s       45%        --      -76%
iterative   55854/s      500%      313%        --

for 1000 items:
               Rate     naive    goatse iterative
naive        1018/s        --      -31%      -82%
goatse       1478/s       45%        --      -75%
iterative    5802/s      470%      293%        --

for 10000 items:
               Rate     naive    goatse iterative
naive         101/s        --      -31%      -82%
goatse        146/s       45%        --      -75%
iterative     575/s      470%      293%        --
Here is the code that generated it:
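#!/usr/bin/perl

use strict;
use warnings;

use Benchmark;

my $s = "a" x 10;

my %subs = (
    # copy every match into an array, then count the array
    naive => sub {
        my @matches = $s =~ /a/g;
        return scalar @matches;
    },
    # assign the matches to an empty list and take the count
    goatse => sub {
        my $count =()= $s =~ /a/g;
        return $count;
    },
    # match one at a time in scalar context and increment a counter
    iterative => sub {
        my $count = 0;
        $count++ while $s =~ /a/g;
        return $count;
    },
);

# sanity check: each sub should report 10 for the initial string
for my $sub (keys %subs) {
    print "$sub: @{[$subs{$sub}()]}\n";
}

for my $n (0, 1, 10, 100, 1_000, 10_000) {
    $s = "a" x $n;
    print "\nfor $n items:\n";
    Benchmark::cmpthese -1, \%subs;
}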
Eri*_*rom 13
In your particular example, a benchmark is useful:
my $str = "5 and 4 and a 3 and 2 1 BLAST OFF!!!";

use Benchmark 'cmpthese';

cmpthese -2 => {
    goatse => sub {
        my $count =()= $str =~ /\d/g;
        $count == 5 or die
    },
    while => sub {
        my $count;
        $count++ while $str =~ /\d/g;
        $count == 5 or die
    },
};
which returns:
           Rate goatse  while
goatse 285288/s     --   -57%
while  661659/s   132%     --
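The $str =~ /\d/g in list context captures every matched substring even though the substrings are not needed. The while example has the regex in scalar (boolean) context, so the regex engine only has to return true or false, not the actual matches.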
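To make the context difference concrete, here is a small sketch using the question's string. In list context one evaluation returns every match; in scalar context each evaluation advances to the next match and returns only true or false. The variable names @digits and $count are illustrative.

```perl
use strict;
use warnings;

my $str = "5 and 4 and a 3 and 2 1 BLAST OFF!!!";

# List context: a single evaluation returns all matched substrings.
my @digits = $str =~ /\d/g;

# Scalar context: each evaluation finds the next match and returns
# true or false; no list of substrings is built.
my $count = 0;
$count++ while $str =~ /\d/g;

print "list context: @digits\n";          # prints "list context: 5 4 3 2 1"
print "scalar context count: $count\n";   # prints "scalar context count: 5"
```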
In general, if you have a list-producing function and only care about the number of items, writing a short count function is faster:
sub make_list {map {$_**2} 0 .. 1000}
sub count {scalar @_}
use Benchmark 'cmpthese';
cmpthese -2 => {
    goatse => sub {my $count =()= make_list; $count == 1001 or die},
    count  => sub {my $count = count make_list; $count == 1001 or die},
};
which gives:
         Rate goatse  count
goatse 3889/s     --   -26%
count  5276/s    36%     --
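My guess as to why the sub is faster is that subroutine calls are optimized to pass lists through without copying them (they are passed as aliases).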
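The aliasing behavior itself is easy to observe, although this sketch says nothing about whether it explains the timing difference. The count sub is the one from the answer; double_in_place is a hypothetical helper added only to show that the elements of @_ are aliases to the caller's values rather than copies.

```perl
use strict;
use warnings;

# Hypothetical helper: modifying elements of @_ changes the caller's
# variables, because @_ holds aliases, not copies.
sub double_in_place { $_ *= 2 for @_; return }

# Same count sub as in the answer above.
sub count { scalar @_ }

my @nums = (1, 2, 3);
double_in_place(@nums);
my $n = count(@nums);

print "@nums ($n items)\n";   # prints "2 4 6 (3 items)"
```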