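I am trying to optimize some code that processes lists of lists, and I keep getting syntax or output errors whenever I try to specify a list within a list.
My code is as follows: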
out = []
for cluster in ClusterFile:
    cluster = list(cluster)
    for term in cluster[3]:
        for item in Interest:
            if term == item[0]:
                x = [item[1]]
                cluster.append(x)
                break
        out.append(cluster)
        break
One of my many attempts:
out = [([item[1]]) for item in Interest for term in cluster[3] if term ==item[0] for cluster in ClusterFile]
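One thing worth noting about the attempt above: the `for` clauses of a comprehension are evaluated left to right, like nested loops, so `cluster` has to be bound by `for cluster in ClusterFile` before `cluster[3]` can be read. Below is a minimal sketch of one possible rewrite, not necessarily what the asker intended. It assumes that `ClusterFile` is a list of clusters shaped like the example input, that every term in `cluster[3]` is a one-element list such as `['w']`, and that all matches should be appended (the loop version stops early because of its `break` statements). The name `lookup` is introduced here for illustration only.

```python
# Sample data modelled on the example input; ClusterFile holding a list of
# such clusters is an assumption.
ClusterFile = [
    [['a'], [1, 2], [3, 4], [['w'], ['x'], ['y'], ['z']], [5, 6]],
]
Interest = [['w', 'qx12'], ['y', 'qx19']]

# Hypothetical helper: map each term of interest to its code for fast lookup.
lookup = {key: value for key, value in Interest}

# The for clauses run left to right, so cluster is bound before cluster[3] is used.
# Assumption: each term is a one-element list, so term[0] is the string to match.
matches = [
    [lookup[term[0]]]
    for cluster in ClusterFile
    for term in cluster[3]
    if term[0] in lookup
]

# Variant: copy every cluster and append all of its matches to the copy.
out = [
    list(cluster) + [[lookup[term[0]]] for term in cluster[3] if term[0] in lookup]
    for cluster in ClusterFile
]

print(matches)
print(out)
```

Using a dict avoids rescanning `Interest` for every term, which may matter if `Interest` is long.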
Example input:
cluster = [['a'], [1, 2], [3, 4], [['w'], ['x'], ['y'], ['z']], [5, 6]]
Interest = [['w', 'qx12'], ['y', 'qx19']]
Example output:
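[['a'], [1, 2], [3, 4], [['w'], ['x'], ['y'], ['z']], …

I have an edge-list file (~80 GB) of a large number of orthologous genes from an old OrthoMCL run. I want to parse out all cliques in the edge list (subgraphs in which every vertex shares an edge with every other vertex) and then collapse each clique into a single line, while ignoring reciprocal duplicates (e.g. GeneA,GeneB <-> GeneB,GeneA) and self hits (GeneA <-> GeneA). I am trying to use Python's networkX (find_cliques), but I am an inexperienced programmer, so I am not getting the desired output. If anyone has experience working with network structures, could you point me in the right direction?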
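The task described above can be sketched on a small scale as follows. This is only an illustration under stated assumptions: it uses a basic plain-Python Bron-Kerbosch enumeration rather than `networkx.find_cliques`, it holds the whole graph in memory (which an 80 GB file will probably not allow without further partitioning), and the function names `build_adjacency` and `maximal_cliques` as well as the shortened sample data are made up for this example.

```python
from collections import defaultdict


def build_adjacency(lines):
    """Build an undirected adjacency dict from 'GeneX,GeneY' lines.

    Self hits are skipped; reciprocal duplicates collapse because
    neighbours are stored in sets.
    """
    adj = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        a, b = line.split(",")
        if a == b:
            continue  # self hit, e.g. GeneA,GeneA
        adj[a].add(b)
        adj[b].add(a)
    return adj


def maximal_cliques(adj):
    """Yield every maximal clique as a sorted list (Bron-Kerbosch, no pivoting)."""
    def expand(r, p, x):
        if not p and not x:
            yield sorted(r)
            return
        for v in list(p):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


# Shortened, hypothetical sample in the same format as the input file.
sample = [
    "GeneA,GeneA", "GeneA,GeneB", "GeneA,GeneC",
    "GeneB,GeneA", "GeneB,GeneC", "GeneC,GeneB",
    "GeneD,GeneE", "GeneE,GeneD",
]

cliques = sorted(maximal_cliques(build_adjacency(sample)))
for clique in cliques:
    print(",".join(clique))  # one collapsed clique per line
```

With networkX, the same adjacency information could instead be loaded into a `networkx.Graph` and passed to `networkx.find_cliques`, which also yields maximal cliques.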
Here is an example input:
GeneA,GeneA
GeneA,GeneB
GeneA,GeneC
GeneB,GeneA
GeneB,GeneB
GeneB,GeneC
GeneC,GeneA
GeneC,GeneB
GeneC,GeneC
GeneD,GeneD
GeneD,GeneE
GeneD,GeneF
GeneE,GeneD
GeneE,GeneE
GeneE,GeneF
GeneF,GeneD
GeneF,GeneE
GeneF,GeneF
GeneH,GeneH
GeneH,GeneI
GeneH,GeneJ
GeneH,GeneK
GeneH,GeneL
GeneH,GeneM
GeneH,GeneN
GeneH,GeneO
GeneH,GeneP
GeneH,GeneQ
GeneI,GeneH
GeneI,GeneI
GeneI,GeneJ
GeneI,GeneK
GeneI,GeneL
GeneI,GeneM
GeneI,GeneN
GeneI,GeneO
GeneI,GeneP
GeneI,GeneQ
GeneJ,GeneH
GeneJ,GeneI
GeneJ,GeneJ
GeneJ,GeneK
GeneJ,GeneL
GeneJ,GeneM
GeneJ,GeneN
GeneJ,GeneO
GeneJ,GeneP
GeneJ,GeneQ
GeneK,GeneH
GeneK,GeneI
GeneK,GeneJ
GeneK,GeneK
GeneK,GeneL
GeneK,GeneM
GeneK,GeneN
GeneK,GeneO
GeneK,GeneP
GeneK,GeneQ
GeneL,GeneH
GeneL,GeneI
GeneL,GeneJ
GeneL,GeneK
GeneL,GeneL
GeneL,GeneM
GeneL,GeneN
GeneL,GeneO
GeneL,GeneP
GeneL,GeneQ …

I have performed a hypergeometric analysis (using a Python script) to investigate the enrichment of GO terms in a subset of genes. An example of my output is as follows:
GO00001 1500 300 200 150 5.39198144708e-77
GO00002 1500 500 400 350 1.18917839281e-160
GO00003 1500 400 350 320 9.48402847878e-209
GO00004 1500 100 100 75 3.82935778527e-82
GO00005 1500 100 80 80 2.67977253966e-114
where
Column1 = GO ID
Column2 = Total sum of all terms in the original dataset
Column3 = Total sum of [Column 1] IDs in the original dataset
Column4 = Sum of all terms in the subset
Column5 = Sum of [Column 1] IDs in subset
Column6 = pvalue …

I am writing some Python code and am trying to work out which function is best suited to the following task:
I want to say "if the number of words in a line is not equal to 1, then do something".
#Some code
words = line.split("\t")
if sum(words) != 1
#Do something
#Some code
words = line.split("\t")
if int(words) != 1
#Do something
#Some code
words = line.split("\t")
if len(words) != 1
#Do something
All of these commands return the following error:
  File "test.py", line 10
    if len(words) != 1
                     ^
SyntaxError: invalid syntax
Can anyone help?
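For what it is worth, the `SyntaxError` shown above is what Python reports when an `if` line does not end with a colon, so all three variants fail before the function call is even evaluated. Of the three, `len(words)` is the one that counts the fields returned by `split`; `sum(words)` and `int(words)` would raise a `TypeError` on a list of strings. A minimal sketch, assuming tab-separated input and using made-up sample lines:

```python
# Hypothetical sample lines; the real script presumably reads them from a file.
lines = ["alpha\n", "alpha\tbeta\n", "a\tb\tc\n"]

multi_word_lines = []
for line in lines:
    # Strip the trailing newline so it does not end up in the last field.
    words = line.rstrip("\n").split("\t")
    if len(words) != 1:  # note the colon that the original line was missing
        multi_word_lines.append(words)

print(multi_word_lines)
```

If "words" should mean anything separated by whitespace rather than by tabs only, `line.split()` with no argument may be the better choice.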