Sim*_*man 1 python variables dictionary
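I wrote a small Python program that iterates over a data file (input_file) and performs calculations. If the result of a calculation reaches certain states (stateA or stateB), information (hits) is extracted from the result. Which hits are extracted depends on the parameters in three parameter sets.
I use a dictionary of dictionaries to store my parameter sets (param_sets) and a dictionary of lists to store the hits (hits). The dictionaries param_sets and hits have the same keys.
The problem is that the lists in the hits dictionary are somehow coupled. When one list changes (through a call to the extract_hits function), the other lists change as well.
Here is the (shortened) code: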
import os, sys, csv, pdb
from operator import itemgetter

# define three parameter sets
param_sets = {
    'A' : {'MIN_LEN' : 8, 'MAX_X' : 0, 'MAX_Z' : 0},
    'B' : {'MIN_LEN' : 8, 'MAX_X' : 1, 'MAX_Z' : 5},
    'C' : {'MIN_LEN' : 9, 'MAX_X' : 1, 'MAX_Z' : 5}}

# to store hits corresponding to each parameter set
hits = dict.fromkeys(param_sets, [])

# calculations
result = []
for input_values in input_file:
    # do some calculations
    result = do_some_calculations(result, input_values)
    if result == stateA:
        for key in param_sets.keys():
            hits[key] = extract_hits(key, result,
                                     hits[key],
                                     param_sets[key]['MIN_LEN'],
                                     param_sets[key]['MAX_X'],
                                     param_sets[key]['MAX_Z'])
        result = []  # discard results, start empty result list
    elif result == stateB:
        for key in param_sets.keys():
            hits[key] = extract_hits(key,
                                     result,
                                     hits[key],
                                     param_sets[key]['MIN_LEN'],
                                     param_sets[key]['MAX_X'],
                                     param_sets[key]['MAX_Z'])
        result = []  # discard results
        result = some_calculation(input_values)  # start new result list
    else:
        result = some_other_calculation(result)  # append result list

def extract_hits(k, seq, hits, min_len, max_au, max_gu):
    max_len = len(seq)
    # try the longest sub-sequences first
    for sub_seq_size in reversed(range(min_len, max_len+1)):
        for start_pos in range(0, (max_len-sub_seq_size+1)):
            from_inc = start_pos
            to_exc = start_pos + sub_seq_size
            sub_seq = seq[from_inc:to_exc]
            # complete information about helical fragment sub_seq
            helical_fragment = get_helix_data(sub_seq, max_au, max_gu)
            if helical_fragment:
                hits.append(helical_fragment)  # mutates the caller's list in place
                # search seq regions left and right from sub_seq for further hits
                left_seq = seq[0:from_inc]
                right_seq = seq[to_exc:max_len]
                if len(left_seq) >= min_len:
                    hits = sub_check_helical(left_seq, hits, min_len, max_au, max_gu)
                if len(right_seq) >= min_len:
                    hits = sub_check_helical(right_seq, hits, min_len, max_au, max_gu)
                print('key', k)                   # just for testing purpose
                print('new', hits)                # just for testing purpose
                print('frag', helical_fragment)   # just for testing purpose
                pdb.set_trace()                   # just for testing purpose
                return hits  # appended
    return hits  # unchanged
Here is some output from the Python debugger:
key A
new ['x', 'x', 'x', {'y': 'GGCCGGGCUUGGU'}]
frag {'y': 'GGCCGGGCUUGGU'}
>
-> return hits
(Pdb) c
key B
new [{'y': 'GGCCGGGCUUGGU'}, {'y': 'CCGGCCCGAGCCG'}]
frag {'y': 'CCGGCCCGAGCCG'}
> extract_hits()
-> return hits
(Pdb) c
key C
new [{'y': 'GGCCGGGCUUGGU'}, {'y': 'CCGGCCCGAGCCG'}, {'y': 'CCGGCCCG'}]
frag {'y': 'CCGGCCCG'}
> extract_hits()
-> return hits
Elements from key A should not appear under key B, and elements from keys A and B should not appear under key C.
Your line:
hits = dict.fromkeys(param_sets, [])
is equivalent to:
hits = dict()
onelist = []
for k in param_sets:
    hits[k] = onelist
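That is, every entry in hits has the SAME list object as its value (initially empty), whatever its key. Remember that assignment does not perform an implicit copy: it assigns "one more reference to the RHS object".
What you want is: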
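The sharing can be checked directly. The following is a minimal sketch that borrows only the key names from the question; the empty parameter dictionaries and the appended strings are placeholders made up for illustration, not real data.

```python
# Keys borrowed from the question; the values are irrelevant here (placeholders).
param_sets = {'A': {}, 'B': {}, 'C': {}}

# fromkeys evaluates [] once and stores that single object under every key.
hits = dict.fromkeys(param_sets, [])

# All three values are the very same list object.
same_object = hits['A'] is hits['B'] is hits['C']

# Appending through one key is therefore visible through every key.
hits['A'].append('hit-for-A')  # illustrative value
b_after_append = list(hits['B'])

# Plain assignment also just adds another reference; nothing is copied.
alias = hits['C']
alias.append('second')
a_after_alias = list(hits['A'])

print(same_object)      # True
print(b_after_append)   # ['hit-for-A']
print(a_after_alias)    # ['hit-for-A', 'second']
```

This is the same effect seen in the debugger output: extract_hits appends to whatever list it is handed, and every key hands it the same one.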
hits = dict()
for k in param_sets:
    hits[k] = []
That is, a NEW AND SEPARATE list object as the value of each entry. Equivalently,
hits = dict((k, []) for k in param_sets)
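As a quick check of that fix, the sketch below builds the dictionary both with the generator form shown here and with a dict comprehension (available in Python 2.7 and 3.x), then confirms that the lists are independent. The empty parameter dictionaries and the appended string are placeholders for illustration.

```python
# Keys borrowed from the question; the values are placeholders.
param_sets = {'A': {}, 'B': {}, 'C': {}}

# The expression [] is evaluated once per key, so each key gets a fresh list.
hits = dict((k, []) for k in param_sets)

# Same result written as a dict comprehension.
hits_comp = {k: [] for k in param_sets}

# The values are now distinct objects.
distinct = hits['A'] is not hits['B'] and hits_comp['A'] is not hits_comp['B']

# Appending under one key no longer leaks into the others.
hits['A'].append('hit-for-A')  # illustrative value

print(distinct)  # True
print(hits)      # {'A': ['hit-for-A'], 'B': [], 'C': []}
```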
By the way, when you do need to make a (shallow) copy of a container, the most general approach is usually to call the container's type with the old container as the argument, as in:
newdict = dict(olddict)
newlist = list(oldlist)
newset = set(oldset)
and so forth; this also works to convert containers between types (newlist = list(oldset) makes a list from a set, and so on).
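To illustrate what "shallow" means in practice, here is a small sketch with made-up values. The outer container is new, but any mutable items inside it are still shared with the original. The standard library's copy.deepcopy is shown only for contrast and goes slightly beyond what the answer above covers.

```python
import copy

oldlist = [[1, 2], [3]]          # illustrative nested data

newlist = list(oldlist)          # shallow copy: new outer list, same inner lists
newlist.append([4])              # affects only the copy
newlist[0].append(99)            # inner list is shared, so oldlist sees this too

deep = copy.deepcopy(oldlist)    # copies the inner lists as well
deep[0].append(-1)               # does not affect oldlist

# Calling a type on a different container converts between container types.
from_set = sorted(list({3, 1, 2}))  # sorted because set order is not guaranteed

print(oldlist)   # [[1, 2, 99], [3]]
print(newlist)   # [[1, 2, 99], [3], [4]]
print(deep)      # [[1, 2, 99, -1], [3]]
print(from_set)  # [1, 2, 3]
```

For the hits dictionary in the question this distinction matters: dict(hits) would give a new dictionary whose values are still the same list objects.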