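I am trying to concatenate hundreds of .fasta files into a single large fasta file containing all of the sequences. I have not found a specific way to do this on the forums. I did come across this code at http://zientzilaria.heroku.com/blog/2007/10/29/merging-single-or-multiple-sequence-fasta-files, which I have adapted a bit.
Fasta.py contains the following code: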
    class fasta:
        def __init__(self, name, sequence):
            self.name = name
            self.sequence = sequence

    def read_fasta(file):
        items = []
        index = 0
        for line in file:
            if line.startswith(">"):
                # a new header line: store the previous record, if there was one
                if index >= 1:
                    items.append(aninstance)
                index += 1
                name = line[:-1]
                seq = ''
                aninstance = fasta(name, seq)
            else:
                # a sequence line: extend the current record
                seq += line[:-1]
                aninstance = fasta(name, seq)
        # store the last record
        items.append(aninstance)
        return items
Here is the adapted script that concatenates the .fasta files:
    import sys
    import glob
    import fasta

    # obtain directory containing single fasta files for query
    filepattern = input('Filename pattern to match: ')
    #obtain …
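Since the script above is cut off, here is a minimal sketch of what the concatenation step could look like. It is not the original script: the function name `concatenate_fasta` and its `output_path` parameter are made up for illustration, and it assumes the files can simply be appended one after another without being parsed into records first.

```python
import glob
import os


def concatenate_fasta(pattern, output_path):
    """Append every file matching `pattern` to `output_path`; return the file count.

    Hypothetical helper, not part of the original script.
    """
    # Sort for a reproducible order, and skip the output file itself
    # in case it also matches the pattern (for example on a second run).
    target = os.path.abspath(output_path)
    paths = sorted(p for p in glob.glob(pattern) if os.path.abspath(p) != target)

    with open(output_path, "w") as out:
        for path in paths:
            with open(path) as handle:
                text = handle.read()
            out.write(text)
            # Make sure the next header starts on its own line.
            if text and not text.endswith("\n"):
                out.write("\n")
    return len(paths)
```

It could be called as `concatenate_fasta(filepattern, 'merged.fasta')`, where `merged.fasta` is just a placeholder name. Each file is read fully into memory, which I assume is fine for many small single-sequence files.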