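Implement a method that performs string compression using counts of repeated characters. For example, aabcccccaaaaaaa would become a2b1c5a7. Also decompress the string back to the original.
I tried the code below, but I'm looking for a one-liner regex solution -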
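sub print_word {
    my $s = shift;
    my @a = split(//, $s);    # split into single characters
    my $c = 1;                # length of the current run
    my $r = '';               # compressed result so far
    my $t = $a[0];            # character of the current run
    for ( my $i = 1; $i <= $#a; $i++ ) {
        if ( $t eq $a[$i] ) {
            $c++;
        } else {
            # run ended: emit character and count, then start a new run
            $r .= $t . "$c";
            $t = $a[$i];
            $c = 1;
        }
    }
    $r .= $t . "$c";          # emit the final run
    return $r;
}
print print_word('aabcccccaaaaaaa') . "\n";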
Please provide something that does this with a regex in one line.
OK, the trick here is to match a backreference within the string:
my $string = 'aabcccccaaaaaaa';
$string =~ s/((\w)\2*)/ "$2". length ($1) /eg;
print $string;
This gives:
a2b1c5a7
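We 'capture' a single word character (\w), then use \2* to mean zero or more repeats of it (so, counting the first letter, that makes it 'one or more').
Then we wrap that in another capture group, which means we have \2 or $2 as our single letter, and \1 or $1 as the whole substring of that same letter.
We print $2 and then - because we set the e flag on the substitution - it evaluates length($1) and inserts the result.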
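The question also asks for decompression, which the substitution above does not cover. A minimal sketch of the reverse direction follows, assuming the original string contains no digits (otherwise the compressed form is ambiguous). The sub names compress and decompress are illustrative, not from the question.

```perl
use strict;
use warnings;

# Compress: each run of one word character becomes "<char><run length>".
sub compress {
    my ($s) = @_;
    $s =~ s/((\w)\2*)/$2 . length($1)/eg;
    return $s;
}

# Decompress: each "<non-digit><count>" pair is expanded with the
# repetition operator x. Assumes the original text had no digits in it.
sub decompress {
    my ($s) = @_;
    $s =~ s/(\D)(\d+)/$1 x $2/eg;
    return $s;
}

print compress('aabcccccaaaaaaa'), "\n";
print decompress(compress('aabcccccaaaaaaa')), "\n";
```

Using \D for the character and \d+ for the count means runs of ten or more characters still round-trip, since the count is read greedily.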
To expand on what I said about efficiency - we need to turn to a code profiler.
Using something like Devel::NYTProf:
perl -d:NYTProf script.pl
nytprofhtml --open
The code you wrote: (profiler output image not preserved)

My example: (profiler output image not preserved)

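Now, there's a question of scale here - by which I mean that if you run this repeatedly, you may find the regex solution starts to 'win'. Using a regex at all incurs an overhead, and some regexes can be very 'expensive'. See: http://blog.codinghorror.com/regex-performance/
Try the same test - for example - running 100,000 times in a loop, and the numbers start to even out.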
Mine: (profiler output image not preserved)

Yours: (profiler output image not preserved)
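Since the profiler images are not available, here is a sketch of another way to run a comparison yourself. It uses the core Benchmark module rather than Devel::NYTProf, so it is not the measurement used above. The sub names compress_loop and compress_regex are made up for this example, and the loop version is a tidied equivalent of the question's code, not a verbatim copy.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $input = 'aabcccccaaaaaaa';

# Loop-based run-length encoding, equivalent in spirit to the question's code.
sub compress_loop {
    my ($s) = @_;
    my @chars = split //, $s;
    return '' unless @chars;
    my ( $current, $count, $result ) = ( shift @chars, 1, '' );
    for my $ch (@chars) {
        if ( $ch eq $current ) { $count++; next; }
        $result .= $current . $count;        # run ended: emit it
        ( $current, $count ) = ( $ch, 1 );   # start the next run
    }
    return $result . $current . $count;      # emit the final run
}

# Regex-based version from the answer.
sub compress_regex {
    my ($s) = @_;
    $s =~ s/((\w)\2*)/$2 . length($1)/eg;
    return $s;
}

# Run each version 100,000 times and print a comparison table.
cmpthese( 100_000, {
    loop  => sub { compress_loop($input) },
    regex => sub { compress_regex($input) },
} );
```

The timings depend on the machine and the Perl version, so treat the resulting table as indicative only.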

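But I'd still suggest - don't worry about performance until you're sure you need to. Until then, go with whatever is easiest to read and understand.
I wasn't sure myself: an answer I gave to another question ended up with catastrophic backtracking, which is why 'be careful with regex usage' was high in my mind.
Regexes look neat and clever, but sometimes they're a bit too clever. In this case, though, that doesn't seem to apply. The regex engine has an overhead, but once it is 'up and running' it performs quite well.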
One useful trick for finding out how 'clever' a regex is being is that you can use re 'debug';
Taking my example, this prints:
Compiling REx "((\w)\2*)"
Final program:
1: OPEN1 (3)
3: OPEN2 (5)
5: POSIXD[\w] (6)
6: CLOSE2 (8)
8: CURLYX[2] {0,32767} (13)
10: REF2 (12)
12: WHILEM[1/1] (0)
13: NOTHING (14)
14: CLOSE1 (16)
16: END (0)
stclass POSIXD[\w] minlen 1
Matching REx "((\w)\2*)" against "aabcccccaaaaaaa"
Matching stclass POSIXD[\w] against "aabcccccaaaaaaa" (15 bytes)
0 <> <aabcccccaa> | 1:OPEN1(3)
0 <> <aabcccccaa> | 3:OPEN2(5)
0 <> <aabcccccaa> | 5:POSIXD[\w](6)
1 <a> <abcccccaaa> | 6:CLOSE2(8)
1 <a> <abcccccaaa> | 8:CURLYX[2] {0,32767}(13)
1 <a> <abcccccaaa> | 12: WHILEM[1/1](0)
whilem: matched 0 out of 0..32767
1 <a> <abcccccaaa> | 10: REF2: "a"(12)
2 <aa> <bcccccaaaa> | 12: WHILEM[1/1](0)
whilem: matched 1 out of 0..32767
2 <aa> <bcccccaaaa> | 10: REF2: "a"(12)
failed...
whilem: failed, trying continuation...
2 <aa> <bcccccaaaa> | 13: NOTHING(14)
2 <aa> <bcccccaaaa> | 14: CLOSE1(16)
2 <aa> <bcccccaaaa> | 16: END(0)
Match successful!
Matching REx "((\w)\2*)" against "bcccccaaaaaaa"
Matching stclass POSIXD[\w] against "bcccccaaaaaaa" (13 bytes)
2 <aa> <bcccccaaaa> | 1:OPEN1(3)
2 <aa> <bcccccaaaa> | 3:OPEN2(5)
2 <aa> <bcccccaaaa> | 5:POSIXD[\w](6)
3 <aab> <cccccaaaaa> | 6:CLOSE2(8)
3 <aab> <cccccaaaaa> | 8:CURLYX[2] {0,32767}(13)
3 <aab> <cccccaaaaa> | 12: WHILEM[1/1](0)
whilem: matched 0 out of 0..32767
3 <aab> <cccccaaaaa> | 10: REF2: "b"(12)
failed...
whilem: failed, trying continuation...
3 <aab> <cccccaaaaa> | 13: NOTHING(14)
3 <aab> <cccccaaaaa> | 14: CLOSE1(16)
3 <aab> <cccccaaaaa> | 16: END(0)
Match successful!
Matching REx "((\w)\2*)" against "cccccaaaaaaa"
Matching stclass POSIXD[\w] against "cccccaaaaaaa" (12 bytes)
3 <aab> <cccccaaaaa> | 1:OPEN1(3)
3 <aab> <cccccaaaaa> | 3:OPEN2(5)
3 <aab> <cccccaaaaa> | 5:POSIXD[\w](6)
4 <aabc> <ccccaaaaaa> | 6:CLOSE2(8)
4 <aabc> <ccccaaaaaa> | 8:CURLYX[2] {0,32767}(13)
4 <aabc> <ccccaaaaaa> | 12: WHILEM[1/1](0)
whilem: matched 0 out of 0..32767
4 <aabc> <ccccaaaaaa> | 10: REF2: "c"(12)
5 <aabcc> <cccaaaaaaa> | 12: WHILEM[1/1](0)
whilem: matched 1 out of 0..32767
5 <aabcc> <cccaaaaaaa> | 10: REF2: "c"(12)
6 <abccc> <ccaaaaaaa> | 12: WHILEM[1/1](0)
whilem: matched 2 out of 0..32767
6 <abccc> <ccaaaaaaa> | 10: REF2: "c"(12)
7 <bcccc> <caaaaaaa> | 12: WHILEM[1/1](0)
whilem: matched 3 out of 0..32767
7 <bcccc> <caaaaaaa> | 10: REF2: "c"(12)
8 <ccccc> <aaaaaaa> | 12: WHILEM[1/1](0)
whilem: matched 4 out of 0..32767
8 <ccccc> <aaaaaaa> | 10: REF2: "c"(12)
failed...
whilem: failed, trying continuation...
8 <ccccc> <aaaaaaa> | 13: NOTHING(14)
8 <ccccc> <aaaaaaa> | 14: CLOSE1(16)
8 <ccccc> <aaaaaaa> | 16: END(0)
Match successful!
Matching REx "((\w)\2*)" against "aaaaaaa"
Matching stclass POSIXD[\w] against "aaaaaaa" (7 bytes)
8 <ccccc> <aaaaaaa> | 1:OPEN1(3)
8 <ccccc> <aaaaaaa> | 3:OPEN2(5)
8 <ccccc> <aaaaaaa> | 5:POSIXD[\w](6)
9 <ccccca> <aaaaaa> | 6:CLOSE2(8)
9 <ccccca> <aaaaaa> | 8:CURLYX[2] {0,32767}(13)
9 <ccccca> <aaaaaa> | 12: WHILEM[1/1](0)
whilem: matched 0 out of 0..32767
9 <ccccca> <aaaaaa> | 10: REF2: "a"(12)
10 <cccccaa> <aaaaa> | 12: WHILEM[1/1](0)
whilem: matched 1 out of 0..32767
10 <cccccaa> <aaaaa> | 10: REF2: "a"(12)
11 <cccccaaa> <aaaa> | 12: WHILEM[1/1](0)
whilem: matched 2 out of 0..32767
11 <cccccaaa> <aaaa> | 10: REF2: "a"(12)
12 <cccccaaaa> <aaa> | 12: WHILEM[1/1](0)
whilem: matched 3 out of 0..32767
12 <cccccaaaa> <aaa> | 10: REF2: "a"(12)
13 <cccccaaaaa> <aa> | 12: WHILEM[1/1](0)
whilem: matched 4 out of 0..32767
13 <cccccaaaaa> <aa> | 10: REF2: "a"(12)
14 <cccccaaaaaa> <a> | 12: WHILEM[1/1](0)
whilem: matched 5 out of 0..32767
14 <cccccaaaaaa> <a> | 10: REF2: "a"(12)
15 <cccccaaaaaaa> <> | 12: WHILEM[1/1](0)
whilem: matched 6 out of 0..32767
15 <cccccaaaaaaa> <> | 10: REF2: "a"(12)
failed...
whilem: failed, trying continuation...
15 <cccccaaaaaaa> <> | 13: NOTHING(14)
15 <cccccaaaaaaa> <> | 14: CLOSE1(16)
15 <cccccaaaaaaa> <> | 16: END(0)
Match successful!
Matching REx "((\w)\2*)" against ""
Regex match can't succeed, so not even tried
Freeing REx: "((\w)\2*)"
As you can see, it is actually doing quite a lot of work on this example. But because it never needs to backtrack at any point to match your string, it isn't really wasting any effort.
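For reference, a trace like this can be produced with a short script along the following lines. This is a sketch: the pragma is placed in a bare block to limit debugging to this one substitution, and the exact trace text can differ between Perl versions.

```perl
use strict;
use warnings;

my $string = 'aabcccccaaaaaaa';

{
    # 'use re "debug"' is lexically scoped, so only regexes compiled
    # inside this block emit compile and match traces (on STDERR).
    use re 'debug';
    $string =~ s/((\w)\2*)/$2 . length($1)/eg;
}

print "$string\n";
```

Because the trace goes to STDERR, the script's normal output can be kept separate by redirecting one of the two streams.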