String matching in Perl: counting the number of matches

AWR*_*RAM 0 regex string perl pattern-matching matching

How can I find the number of occurrences of each pair of consecutive characters (AA, AC, AG, AT, CC, CA, ...) in a sequence like this:

my $sequence = 'AACGTACTGACGTACTGGTTGGTACGA';

Overlaps are not allowed, i.e. $sequence should be read from left to right as AA CG TA CT ..., not as AA AC CG ...

Fai*_*Dev 5

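# $subject holds the sequence; in list context, /g returns every non-overlapping match
my @result = $subject =~ m/[ACTG][ATGC]/g;

# total number of pairs found
print scalar(@result);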

Edit, since you completely changed your question:

use strict;

my $subject = "AACGTACTGACGTACTGGTTGGTACGA";
my %results = ();
while ($subject =~ m/[ACTG][ATGC]/g) {
    # matched text = $&
    if (exists $results{$&}) {
        $results{$&}++;
    }
    else {
        $results{$&} = 1;
    }
}

foreach (sort keys %results) {
    print "$_ : $results{$_}\n";
}

Output:

AA : 1
CG : 3
CT : 2
GA : 1
GG : 2
TA : 3
TT : 1

Final edit (hopefully), thanks to @canavanin:

use strict;

my $subject = "AACGTACTGACGTACTGGTTGGTACGA";
my %results = ();
while ($subject =~ m/[ACTG][ATGC]/g) {
    # matched text = $&
    $results{$&}++;
}

foreach (sort keys %results) {
    print "$_ : $results{$_}\n";
}

  • @FailedDev You don't need the if block. Thanks to autovivification, you can replace the whole block with $results{$&}++. (3 upvotes)
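
For what it's worth, here is a minimal sketch of the same counting idea wrapped in a sub. The name count_pairs is made up for illustration and is not part of the answer above. It uses a capture group ($1) instead of $&, and relies on two things: /g resumes each search where the previous match ended, so pairs never overlap, and ++ on a missing hash key creates it, so no exists check is needed.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the answer): count non-overlapping
# two-letter pairs, reading the sequence from left to right.
sub count_pairs {
    my ($seq) = @_;
    my %count;
    # In scalar context, /g continues from the end of the previous match,
    # so the pairs are AA, CG, TA, ... and never AA, AC, CG, ...
    while ($seq =~ /([ACGT]{2})/g) {
        $count{$1}++;    # the key is created on first use, no "exists" test needed
    }
    return \%count;
}

my $pairs = count_pairs('AACGTACTGACGTACTGGTTGGTACGA');
print "$_ : $pairs->{$_}\n" for sort keys %$pairs;
```

The sequence has 27 characters, so the trailing A has no partner and is simply left uncounted.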