Remove residues from a PDB using the Biopython library

Question

Exc*_*ttu 5 python protein-database biopython

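Using the Biopython library, I want to remove the residues whose numbers are given in a list, as shown below. This thread (http://pelican.rsvs.ulaval.ca/mediawiki/index.php/Manipulated_PDB_files_using_BioPython) provides an example of removing residues. I have the following code to remove them: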

 residue_ids_to_remove = [105, 5, 8, 10, 25, 48]
 structure = pdbparser.get_structure("3chy", "./3chy.pdb")
 first_model = structure[0]
 for chain in first_model:
     for residue in chain:
         id = residue.id
         if id[1] in residue_ids_to_remove:
             chain.detach_child(id[1])
 modified_first_model = first_model

But this code does not work and raises an error:

def detach_child(self, id):
    "Remove a child."
    child=self.child_dict[id]
    KeyError: '105'

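What is wrong with this code?

Alternatively, I could use accept_residue() and write the result to a PDB file. I would rather not go that route, because I want to do this in memory for further processing.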

Answer 1

xbe*_*llo 5

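Biopython cannot find the key in the chain's internal dictionary because you are passing an arbitrary key that the dictionary does not contain. The dictionary looks like this: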

child_dict = {(' ', 5, ' '): <Residue HOH het=W resseq=5 icode= >,
              (' ', 6, ' '): <Residue HOH het=W resseq=6 icode= >,
              (' ', 7, ' '): <Residue HOH het=W resseq=7 icode= >}

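In other words, tuples are used as the dictionary keys. Each key is the full residue id: (hetero flag, sequence number, insertion code). You can inspect the dictionary with print(chain.child_dict).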
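
To see the key format without a PDB file at hand, a chain can be built in memory. This is only a sketch: the chain id "A", the residue name "GLY" and the residue numbers are made up for illustration and are not taken from 3chy.

```python
from Bio.PDB.Chain import Chain
from Bio.PDB.Residue import Residue

# Hand-built chain with three placeholder residues (illustrative values only).
chain = Chain("A")
for resseq in (5, 6, 7):
    chain.add(Residue((" ", resseq, " "), "GLY", " "))

keys = list(chain.child_dict)
print(keys)  # [(' ', 5, ' '), (' ', 6, ' '), (' ', 7, ' ')]

# The full tuple is a key of child_dict; the bare residue number is not.
has_tuple_key = (" ", 5, " ") in chain.child_dict
has_int_key = 5 in chain.child_dict
```

Here has_tuple_key is True and has_int_key is False, which is why detach_child fails when it is given only the residue number.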

Once you know this, the error and its solution are clear. Pass a valid key to detach_child, i.e. drop the [1]:

   if id[1] in residue_ids_to_remove:
       chain.detach_child(id)
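
The failure and the fix can be reproduced in a few lines. The sketch below assumes a hand-built, single-residue chain (placeholder name "ALA", number 105) instead of the real 3chy structure.

```python
from Bio.PDB.Chain import Chain
from Bio.PDB.Residue import Residue

# One placeholder residue, numbered 105 (illustrative, not read from 3chy).
chain = Chain("A")
chain.add(Residue((" ", 105, " "), "ALA", " "))
residue = chain.child_list[0]

# Passing only the sequence number reproduces the KeyError from the question.
try:
    chain.detach_child(residue.id[1])
    raised = False
except KeyError:
    raised = True

# Passing the full id tuple works.
chain.detach_child(residue.id)
remaining = len(chain)
```

After this runs, raised is True and remaining is 0.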

The right way

Detach the children at the chain level, without looping over the residues directly:

for chain in first_model:
    for id in residue_ids_to_remove:
        # Full residue key: (hetero flag, sequence number, insertion code)
        chain.detach_child((' ', id, ' '))

Or using a list comprehension:

for chain in first_model:
    [chain.detach_child((' ', id, ' ')) for id in residue_ids_to_remove]
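
One caveat with detaching by key: detach_child raises KeyError when a chain does not contain the requested residue, which is likely when a model has several chains or a number in the list is absent. A possible guard is sketched below. The helper name remove_residues and the two-chain test model are hypothetical, and the sketch only targets standard residues (blank hetero flag and insertion code).

```python
from Bio.PDB.Model import Model
from Bio.PDB.Chain import Chain
from Bio.PDB.Residue import Residue

def remove_residues(model, resseqs):
    """Detach the listed residue numbers from every chain; return how many were removed.

    Hypothetical helper: only keys of the form (' ', resseq, ' ') are considered.
    """
    removed = 0
    for chain in model:
        for resseq in resseqs:
            key = (" ", resseq, " ")
            if key in chain.child_dict:  # skip numbers this chain lacks
                chain.detach_child(key)
                removed += 1
    return removed

# Illustrative two-chain model built in memory (not the 3chy structure).
model = Model(0)
for chain_id, resseqs in (("A", [5, 8, 10]), ("B", [5, 6])):
    chain = Chain(chain_id)
    for resseq in resseqs:
        chain.add(Residue((" ", resseq, " "), "GLY", " "))
    model.add(chain)

removed = remove_residues(model, [105, 5, 8])
left = {chain.id: [r.id[1] for r in chain] for chain in model}
```

With this test model, three residues are removed, residue 105 is silently skipped, and chains A and B keep residues 10 and 6 respectively.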

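I found it. When you loop over the list while modifying it, adjacent residues get skipped. You have to loop over a copy of the list: "for residue in list(chain):" instead of "for residue in chain:" (2 upvotes)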
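
The skipping described in this comment can be shown on a small in-memory chain. This is a sketch with made-up residue numbers, and build_chain is a hypothetical helper that exists only for the demonstration.

```python
from Bio.PDB.Chain import Chain
from Bio.PDB.Residue import Residue

def build_chain(resseqs):
    """Hypothetical helper: chain "A" with placeholder GLY residues."""
    chain = Chain("A")
    for resseq in resseqs:
        chain.add(Residue((" ", resseq, " "), "GLY", " "))
    return chain

to_remove = {5, 6}  # two adjacent residues

# Detaching while iterating over the chain itself: residue 6 is skipped.
naive = build_chain([5, 6, 7, 8])
for residue in naive:
    if residue.id[1] in to_remove:
        naive.detach_child(residue.id)
naive_left = [r.id[1] for r in naive]

# Iterating over a copy: both residues are removed.
safe = build_chain([5, 6, 7, 8])
for residue in list(safe):
    if residue.id[1] in to_remove:
        safe.detach_child(residue.id)
safe_left = [r.id[1] for r in safe]
```

Here naive_left is [6, 7, 8] while safe_left is [7, 8]: removing residue 5 shifts residue 6 into the slot the iterator has already passed.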
