Remove all extensions of a string in a list

lam*_*iji 5 python dictionary python-3.x

I have a dictionary like:

'1' : ['GAA', 'GAAA', 'GAAAA', 'GAAAAA', 'GAAAAAG', 'GAAAAAGU', 'GAAAAAGUA', 'GAAAAAGUAU', 'GAAAAAGUAUG', 'GAAAAAGUAUGC', 'GAAAAAGUAUGCA', 'GAAAAAGUAUGCAA', 'GAAAAAGUAUGCAAG', 'GAAAAAGUAUGCAAGA', 'GAAAAAGUAUGCAAGAA', 'GAAAAAGUAUGCAAGAAC']

'2' : ['GAG', 'GAGA', 'GAGAG', 'GAGAGA', 'GAGAGAG', 'GAGAGAGA', 'GAGAGAGAC', 'GAGAGAGACA', 'GAGAGAGACAU', 'GAGAGAGACAUA', 'GAGAGAGACAUAG', 'GAGAGAGACAUAGA', 'GAGAGAGACAUAGAG', 'GAGAGAGACAUAGAGG']

'3' : ['GUC', 'GUCU', 'GUCUU', 'GUCUUU', 'GUCUUUG', 'GUCUUUGU', 'GUCUUUGU"', 'GUCUUUGU"G', 'GUCUUUGU"GU', 'GUCUUUGU"GUA', 'GUCUUUGU"GUAC', 'GUCUUUGU"GUACA', 'GUCUUUGU"GUACAU', 'GUCUUUGU"GUACAUC']

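I'm trying to write a program that finds the shortest string in each list (for example, GAA in the first one) and uses it to find all other entries that are merely extensions of GAA (strings that start with GAA and just have extra letters), then deletes them.

I know there are many questions here about removing items from a list, but none of them has helped me solve this.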

Ayu*_*ush 4

>>> dictionary={ '1': ['GAA', 'GAAA', 'GAAAA', 'GAAAAA', 'GAAAAAG', 'GAAAAAGU',
                    'GAAAAAGUA', 'GAAAAAGUAU', 'GAAAAAGUAUG', 'GAAAAAGUAUGC', 
                    'GAAAAAGUAUGCA', 'GAAAAAGUAUGCAA', 'GAAAAAGUAUGCAAG', 
                    'GAAAAAGUAUGCAAGA', 'GAAAAAGUAUGCAAGAA', 'GAAAAAGUAUGCAAGAAC', 
                    'RTRSRS','GAG', 'GAGA', 'GAGAG', 'GAGAGA', 'GAGAGAG', 'GAGAGAGA',
                  'GAGAGAGAC', 'GAGAGAGACA', 'GAGAGAGACAU', 'GAGAGAGACAUA', 
                  'GAGAGAGACAUAG', 'GAGAGAGACAUAGA', 'GAGAGAGACAUAGAG',
                  'GAGAGAGACAUAGAGG']}
>>> new_dict = {}

>>> for i in dictionary:
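        # Length of the shortest string in this list
        l = len(min(dictionary[i], key=len))
        # Every string of that minimal length serves as a base prefix
        m = [x for x in dictionary[i] if len(x) == l]
        temp = list(m)
        # Keep only the strings that do not start with any base prefix
        for k in dictionary[i]:
            if not k.startswith(tuple(m)):
                temp.append(k)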
        new_dict[i] = temp

>>> print(new_dict)
# {'1': ['GAA', 'GAG', 'RTRSRS']}
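
Note that the loop above only uses the shortest strings in each list as prefixes, so a longer string such as 'RTRSRS' would never cause its own extensions to be removed. If that matters, a slightly more general variant could look like the sketch below. The helper name `drop_extensions` and the sample data are my own assumptions, not part of the question; the idea is to walk each list in order of length and drop any string that starts with one already kept.

```python
def drop_extensions(strings):
    """Return the strings that are not extensions of an already-kept string.

    Hypothetical helper (name chosen for illustration only). Exact
    duplicates are dropped too, since a string starts with itself.
    """
    kept = []
    # sorted() is stable, so strings of equal length keep their input order
    for s in sorted(strings, key=len):
        # str.startswith accepts a tuple of prefixes; an empty tuple gives False
        if not s.startswith(tuple(kept)):
            kept.append(s)
    return kept


# Made-up sample data in the same shape as the question's dictionary
data = {'1': ['GAA', 'GAAA', 'GAAAA', 'RTRSRS', 'RTRSRSA', 'GAG', 'GAGA']}
result = {key: drop_extensions(values) for key, values in data.items()}
print(result)  # {'1': ['GAA', 'GAG', 'RTRSRS']}
```

One caveat with this sketch: if a list contains an empty string, every other entry counts as its extension and gets removed, so filter empty strings out first if they can occur.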