Biopython local BLAST database error

pri*_*hah 3 python database path biopython blast

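I am trying to run blastx locally against the "nr" database using Biopython's NcbiblastxCommandline tool, but I always get the following error about the search path for the protein database: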

>>> from Bio.Blast.Applications import NcbiblastxCommandline
>>> nr = "/Users/Priya/Documents/Python/ncbi-blast-2.2.26+/bin/nr.pal"
>>> infile = "/Users/Priya/Documents/Python/Tutorials/opuntia.txt"
>>> blastx = "/Users/Priya/Documents/Python/ncbi-blast-2.2.26+/bin/blastx"
>>> outfile = "/Users/Priya/Documents/Python/Tutorials/opuntia_python_local.xml"
>>> blastx_cline = NcbiblastxCommandline(blastx, query = infile, db = nr, evalue = 0.001, out = outfile)
>>> stdout, stderr = blastx_cline()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Bio/Application/__init__.py", line 443, in __call__
    stdout_str, stderr_str)
Bio.Application.ApplicationError: Command '/Users/Priya/Documents/Python/ncbi-blast-2.2.26+/bin/blastx -out /Users/Priya/Documents/Python/Tutorials/opuntia_python_local.xml -query /Users/Priya/Documents/Python/Tutorials/opuntia.txt -db /Users/Priya/Documents/Python/ncbi-blast-2.2.26+/bin/nr.pal -evalue 0.001' returned non-zero exit status 2, 'BLAST Database error: No alias or index file found for protein database [/Users/Priya/Documents/Python/ncbi-blast-2.2.26+/bin/nr.pal] in search path [/Users/Priya::]'

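I don't know how to change the path so that it points to the nr database I downloaded, but I believe I am pointing to it correctly, because I can run the same search from the command line without any problems: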

Priyas-iMac:~ Priya$ /Users/priya/Documents/Python/ncbi-blast-2.2.26+/bin/blastx -query /Users/priya/Documents/Python/Tutorials/opuntia.txt -db /Users/priya/Documents/Python/ncbi-blast-2.2.26+/bin/nr -out /Users/priya/Documents/Python/Tutorials/opuntia_local.xml -evalue 0.001 -outfmt 5

The command above creates the XML file of BLAST results that I expect.

Any help getting this to work with the Biopython NCBI command-line tools would be greatly appreciated!

bow*_*bow 5

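Your nr variable ends in nr.pal. Using nr (without the .pal extension) should work, since -db expects the database's base name rather than the alias file itself. If removing .pal does not help, you can try creating a .ncbirc file in your home directory containing: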

[BLAST]
BLASTDB=/directory/path/to/blast/databases

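This essentially sets an environment variable that BLAST uses to look up its databases. After that, you can set the nr variable to just nr (no path needed).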
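If you would rather not create a .ncbirc file, a similar effect can probably be had by setting the BLASTDB environment variable for the child process only. The following is a minimal sketch using only the standard library. The helper name blast_env is hypothetical, and the directory is the same placeholder as above, not a real path:

```python
import os


def blast_env(db_dir, base_env=None):
    """Return a copy of the environment with BLASTDB pointing at db_dir.

    blast_env is a hypothetical helper for illustration; it does not
    modify os.environ itself, only the copy it returns.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["BLASTDB"] = db_dir
    return env


# Placeholder directory, as in the .ncbirc example above.
env = blast_env("/directory/path/to/blast/databases")
print(env["BLASTDB"])

# The copy could then be handed to a child process, for example:
#   subprocess.run(["blastx", "-db", "nr", "-query", infile], env=env)
# so that the bare name "nr" is looked up in the BLASTDB directory.
```

Passing a copy keeps the change local to the BLAST call instead of altering the environment of the whole Python session.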

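By the way, you can inspect the command line that NcbiblastxCommandline constructs with print blastx_cline (or print(blastx_cline) under Python 3). My guess is that it is not the same as the one you typed by hand.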
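To make that comparison concrete, here is a rough sketch that builds the blastx argument list by hand with the standard library, so it can be printed and checked against the shell command from the question. The helpers db_base_name and build_blastx_args are hypothetical names, not part of Biopython. The sketch assumes the only differences that matter are the .pal suffix on the database and the -outfmt 5 flag, which the hand-typed command passes but the Python call omits:

```python
import os
import shlex

# BLAST alias-file extensions (.pal for protein, .nal for nucleotide).
ALIAS_SUFFIXES = (".pal", ".nal")


def db_base_name(db_path):
    """Drop a trailing alias-file extension so -db gets the base name."""
    root, ext = os.path.splitext(db_path)
    return root if ext in ALIAS_SUFFIXES else db_path


def build_blastx_args(blastx, query, db, out, evalue=0.001, outfmt=5):
    """Assemble the blastx arguments as a list (hypothetical helper)."""
    return [
        blastx,
        "-query", query,
        "-db", db_base_name(db),
        "-out", out,
        "-evalue", str(evalue),
        "-outfmt", str(outfmt),  # 5 = XML, as in the hand-typed command
    ]


args = build_blastx_args(
    "/Users/Priya/Documents/Python/ncbi-blast-2.2.26+/bin/blastx",
    "/Users/Priya/Documents/Python/Tutorials/opuntia.txt",
    "/Users/Priya/Documents/Python/ncbi-blast-2.2.26+/bin/nr.pal",
    "/Users/Priya/Documents/Python/Tutorials/opuntia_python_local.xml",
)

# Print the command as a single shell-style string for comparison.
print(shlex.join(args))
```

The list in args could also be passed directly to subprocess.run, which avoids shell quoting issues with the paths.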

Edit: Also have a look at http://www.biostars.org/, a site for bioinformatics-specific questions with a format similar to StackExchange.