I'm trying to solve this bioinformatics problem: https://stepic.org/lesson/An-Explosion-of-Hidden-Messages-4/step/1?course=Bioinformatics-Algorithms-2&unit=8
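The specific question is in the 5th window of the link above, and it asks: How many different 9-mers form (500,3)-clumps in the E. coli genome? (In other words, do not count a 9-mer more than once.)
My code is below. It gives the wrong answer, and I'd like an explanation of why, and of how to improve it (obviously the big-O efficiency is terrible, but I only started writing Python a few days ago...). Thanks a lot!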
genome = ''  # insert E. coli genome here
k = 4        # length of k-mer
L = 50       # size of sliding window
t = 3        # k-mer appears t times
counter = 0
Count = []
for i in range(0, len(genome) - L):  # slide window down the genome
    pattern = genome[i:i+k]          # given this k-mer
    for j in range(i, i + L):        # calculate k-mer frequency in window of len(L)
        if genome[j:j+k] == pattern:
            counter = counter + 1
    Count.append(counter)
…
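For reference, a few things stand out in the portion of the code shown. `counter` is set to 0 once and never reset, so it keeps accumulating across windows. The inner loop runs `j` up to `i + L - 1`, so it also counts k-mers that stick out past the end of the window. And nothing shown compares the count against `t` or collects the distinct k-mers that qualify. Below is a minimal sketch of one possible alternative, not the course's reference solution: it records every start position of each k-mer and then checks whether any `t` consecutive occurrences fit inside `L` characters. The function name `find_clumps` is made up for illustration, and the sketch assumes `t >= 1` and that the genome is a plain string.

```python
from collections import defaultdict


def find_clumps(genome, k, L, t):
    """Return the set of distinct k-mers that occur at least t times
    inside some window of length L of genome (hypothetical helper)."""
    # Record every start position of every k-mer, in increasing order.
    positions = defaultdict(list)
    for i in range(len(genome) - k + 1):
        positions[genome[i:i + k]].append(i)

    clumps = set()
    for kmer, starts in positions.items():
        # Pair each occurrence with the one t - 1 occurrences later.
        # The t occurrences fit in a single window when the distance from
        # the first start to the end of the last occurrence is at most L.
        for first, last in zip(starts, starts[t - 1:]):
            if last + k - first <= L:
                clumps.add(kmer)
                break
    return clumps
```

Because the result is a set, each k-mer is counted only once. For the question above, the count would be `len(find_clumps(genome, 9, 500, 3))` once `genome` holds the E. coli sequence. This is a single pass over the genome plus a pass over the stored positions, instead of rescanning every window.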