Kia*_*ian 5 python regex object bioinformatics dna-sequence
I am trying to find matches of a given length (ranging from 4 to 12) in a DNA sequence.
Here is the code:
import re

positions = []
for i in range(4, 12):
    for j in range(len(dna) - i + 1):
        positions.append(re.search(dna[j:j+i], comp_dna))

# Remove the 'None' entries from the search results
position_hits = [x for x in positions if x is not None]
This is what I get:
[<_sre.SRE_Match object; span=(0, 4), match='ATGC'>,.........]
How can I extract the values of span and match? I tried .group(), but it throws an error:
AttributeError: 'list' object has no attribute 'group'
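The error occurs because .group() is a method of each individual match object, not of the list that holds them. If you want to fix your current approach, you can use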
position_hits = [x.group() for x in positions if x]
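Since the question also asks about the span, here is a minimal sketch of reading both values from each match object. The sample strings are made up for illustration (the question does not show dna or comp_dna), and only a single fragment length of 4 is used to keep it short.

```python
import re

# Hypothetical sample data; the question does not show the real sequences.
dna = "ATGCA"
comp_dna = "TTATGCAGG"

# Search for every 4-character fragment of dna inside comp_dna.
positions = [re.search(dna[j:j + 4], comp_dna) for j in range(len(dna) - 4 + 1)]

# .group() returns the matched text, .span() returns a (start, end) tuple.
hits = [(m.group(), m.span()) for m in positions if m is not None]
print(hits)
```

m.start() and m.end() return the two halves of the span separately if that is more convenient.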
Alternatively, you can collect all the matches directly in the for loop:
import re

position_hits = []
for i in range(4, 12):
    for j in range(len(dna) - i + 1):
        m = re.search(dna[j:j+i], comp_dna)
        if m:  # re.search returns None when there is no match
            position_hits.append(m.group())
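Note that range(4, 12) stops at 11, so fragments of length 12 are never tried. The following is a sketch of one way to wrap the loop in a function with an inclusive upper bound. The function name find_hits, its parameters, and the sample sequences are assumptions made for this example, not part of the original question. re.escape is added only as a precaution in case the sequence contains characters that are special in a regular expression.

```python
import re

def find_hits(dna, comp_dna, min_len=4, max_len=12):
    """Return (matched_text, span) for every fragment of dna found in comp_dna."""
    hits = []
    # max_len + 1 makes the upper bound inclusive, unlike range(4, 12).
    for size in range(min_len, max_len + 1):
        for start in range(len(dna) - size + 1):
            fragment = dna[start:start + size]
            # Escape the fragment so it is always treated as literal text.
            m = re.search(re.escape(fragment), comp_dna)
            if m is not None:
                hits.append((m.group(), m.span()))
    return hits

# Hypothetical sample data for illustration only.
hits = find_hits("ATGCA", "TTATGCAGG")
print(hits)
```

re.search reports only the first occurrence of each fragment. If every occurrence is needed, re.finditer can be used in its place.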