Generating synthetic DNA sequences with a substitution rate

nev*_*int 6 algorithm perl bioinformatics dna-sequence

Given these inputs:

my $init_seq = "AAAAAAAAAA";  # length 10 bp
my $sub_rate = 0.003;
my $nof_tags = 1000;
my @dna = qw( A C G T );

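I want to generate:

  1. One thousand tags of length 10

  2. With a substitution rate of 0.003 at each position in the tag

The output should look like this: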

AAAAAAAAAA
AATAACAAAA
.....
AAGGAAAAGA # 1000th tag

Is there a compact way to do this in Perl?

I am stuck on the logic of this script, which is meant to be the core:

#!/usr/bin/perl

my $init_seq = "AAAAAAAAAA";  # length 10 bp
my $sub_rate = 0.003;
my $nof_tags = 1000;
my @dna = qw( A C G T );

    $i = 0;
    while ($i < length($init_seq)) {
        $roll = int(rand 4) + 1;       # $roll is now an integer between 1 and 4

        if ($roll == 1) {$base = 'A';}
        elsif ($roll == 2) {$base = 'T';}
        elsif ($roll == 3) {$base = 'C';}
        elsif ($roll == 4) {$base = 'G';}

        print $base;
    }
    continue {
        $i++;
    }

Pen*_*old 5

As a small optimization, replace:

    $roll = int(rand 4) + 1;       # $roll is now an integer between 1 and 4

    if ($roll == 1) {$base = 'A';}
    elsif ($roll == 2) {$base = 'T';}
    elsif ($roll == 3) {$base = 'C';}
    elsif ($roll == 4) {$base = 'G';}
with:

    $base = $dna[int(rand 4)];
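Going one step further than the array lookup above, a minimal sketch of the whole task might look like the following. It rests on two assumptions the question does not spell out: each position mutates independently with probability `$sub_rate`, and a substitution always changes the base to one of the other three. `mutate_seq` is a hypothetical helper name chosen for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $init_seq = "AAAAAAAAAA";   # length 10 bp
my $sub_rate = 0.003;          # per-position substitution probability
my $nof_tags = 1000;
my @dna      = qw( A C G T );

# mutate_seq (hypothetical helper): returns a copy of $seq in which each
# position is, independently and with probability $rate, replaced by a
# base drawn uniformly from $alphabet excluding the current base.
# Assumption: a "substitution" never leaves the base unchanged.
sub mutate_seq {
    my ($seq, $rate, $alphabet) = @_;
    my @bases = split //, $seq;
    for my $base (@bases) {              # $base aliases the array element
        next unless rand() < $rate;      # rand() returns a value in [0, 1)
        my @others = grep { $_ ne $base } @$alphabet;
        $base = $others[ int rand @others ];
    }
    return join '', @bases;
}

# Print one tag per line.
print mutate_seq($init_seq, $sub_rate, \@dna), "\n" for 1 .. $nof_tags;
```

Under these assumptions, a rate of 0.003 over 10 positions leaves a tag untouched with probability 0.997^10, roughly 0.97, so most of the 1000 tags will be identical to `$init_seq`. If a substitution is allowed to pick the same base again, drop the `grep` and index into `@$alphabet` directly.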