Tor*_*den 10 python dictionary list
Friends, I have a list of dictionaries:
my_list = [
    {'oranges':'big','apples':'green'},
    {'oranges':'big','apples':'green','bananas':'fresh'},
    {'oranges':'big','apples':'red'},
    {'oranges':'big','apples':'green','bananas':'rotten'}
]
I want to create a new list with the partial duplicates eliminated.
In my case, this dictionary must be removed:
{'oranges':'big','apples':'green'}
because all of its items are already contained in the longer dictionaries:
{'oranges':'big','apples':'green','bananas':'fresh'}
{'oranges':'big','apples':'green','bananas':'rotten'}
So the desired result is:
[
{'oranges':'big','apples':'green','bananas':'fresh'},
{'oranges':'big','apples':'red'},
{'oranges':'big','apples':'green','bananas':'rotten'}
]
How can I do this? Thanks a lot!
The first [well, second, after some edits..] thing that comes to mind is this:
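def get_superdicts(dictlist):
    superdicts = []
    # Visit the longest dicts first, so every possible superset is seen before its subsets
    for d in sorted(dictlist, key=len, reverse=True):
        fd = set(d.items())
        # Keep d only if its items are not contained in an already-kept dict
        if not any(fd <= k for k in superdicts):
            superdicts.append(fd)
    # Build a list explicitly (on Python 3, map() alone returns an iterator)
    new_dlist = list(map(dict, superdicts))
    return new_dlist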
This gives:
>>> a = [{'apples': 'green', 'oranges': 'big'}, {'apples': 'green', 'oranges': 'big', 'bananas': 'fresh'}, {'apples': 'red', 'oranges': 'big'}, {'apples': 'green', 'oranges': 'big', 'bananas': 'rotten'}]
>>>
>>> get_superdicts(a)
[{'apples': 'red', 'oranges': 'big'},
{'apples': 'green', 'oranges': 'big', 'bananas': 'rotten'},
{'bananas': 'fresh', 'oranges': 'big', 'apples': 'green'}]
[Originally I used frozenset here, thinking I could do some clever set operations, but evidently I didn't come up with any.]
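The approach above returns the surviving dictionaries in length-sorted order, not in their original order. If the original order matters, a minimal sketch along the same lines could look like the following. The name `drop_partial_duplicates` is hypothetical (it does not appear in the answers), and the sketch assumes every value is hashable.

```python
def drop_partial_duplicates(dictlist):
    """Keep each dict unless its items are a strict subset of another dict's items.

    Hypothetical helper for illustration; assumes all values are hashable.
    Exact duplicates are kept once (the first occurrence wins).
    """
    item_sets = [frozenset(d.items()) for d in dictlist]
    seen = set()
    result = []
    for d, items in zip(dictlist, item_sets):
        if items in seen:
            continue  # exact duplicate of an earlier dict
        seen.add(items)
        # `<` on frozensets is the strict-subset test
        if any(items < other for other in item_sets):
            continue  # every item of d also appears in a longer dict
        result.append(d)
    return result
```

Each dictionary is compared against every other one, so this is quadratic in the length of the list, which is the same order of work as the sorted version.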
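Try the implementation below.
Note that in my implementation I pre-sort the list and only look at 2-element combinations, to reduce the number of iterations. This ensures that the size of key is always less than or equal to the size of hay.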
>>> my_list =[
{'oranges':'big','apples':'green'},
{'oranges':'big','apples':'green','bananas':'fresh'},
{'oranges':'big','apples':'red'},
{'oranges':'big','apples':'green','bananas':'rotten'}
]
# Create a function remove_dup (name it anything you want)
def remove_dup(lst):
    # Import combinations from itertools, mainly to avoid multiple nested loops
    from itertools import combinations
    # Create a generator function dup_gen (name it anything you want)
    def dup_gen(lst):
        # Read the pairs; remember that key is never longer than hay
        for key, hay in combinations(lst, 2):
            # If key is contained in hay, then set(key) - set(hay) is the empty set
            if not set(key) - set(hay):
                # key is contained in hay, so yield it as a duplicate
                yield key
    # Convert each dict to a tuple of its items, then sort the tuples by length.
    # The set() handles exact duplicates (thanks to DSM for pointing out this
    # boundary case); without it, remove_dup([{1:2}, {1:2}]) == []
    lst = sorted(set(tuple(e.items()) for e in lst), key=len)
    # Recreate the dictionaries from the set difference of the original list
    # and the elements generated by dup_gen, which are the duplicates to remove
    return [dict(e) for e in set(lst) - set(dup_gen(lst))]
>>> remove_dup(my_list)
[{'apples': 'green', 'oranges': 'big', 'bananas': 'fresh'}, {'apples': 'green', 'oranges': 'big', 'bananas': 'rotten'}, {'apples': 'red', 'oranges': 'big'}]
>>> remove_dup([{1:2}, {1:2}])
[{1: 2}]
>>> remove_dup([{1:2}])
[{1: 2}]
>>> remove_dup([])
[]
>>> remove_dup([{1:2}, {1:3}])
[{1: 2}, {1: 3}]
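To see why the pre-sort matters: `itertools.combinations` emits pairs in input order, so once the list is sorted by length, the first element of every pair (the `key`) is never longer than the second (the `hay`). A small illustration with made-up tuples:

```python
from itertools import combinations

# Sample data invented for illustration only
items = sorted([(1, 2, 3), (1,), (1, 2)], key=len)
pairs = list(combinations(items, 2))

# Every pair comes out as (shorter_or_equal, longer_or_equal)
assert all(len(key) <= len(hay) for key, hay in pairs)
print(pairs)
# [((1,), (1, 2)), ((1,), (1, 2, 3)), ((1, 2), (1, 2, 3))]
```

This is why only one direction of the containment test is needed per pair.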
A faster implementation
from itertools import combinations

def remove_dup(lst):
    # Convert each dict to a tuple of its items, then sort the tuples by length.
    # The set() handles exact duplicates (thanks to DSM for pointing out this
    # boundary case); without it, remove_dup([{1:2}, {1:2}]) == []
    lst = sorted(set(tuple(e.items()) for e in lst), key=len)
    # Generate all the duplicates
    dups = (key for key, hay in combinations(lst, 2) if not set(key).difference(hay))
    # Recreate the dictionaries from the set difference of
    # the original list and the duplicate elements
    return [dict(e) for e in set(lst).difference(dups)]
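Both versions of remove_dup build their result from a set, so the order of the returned list is not guaranteed. When checking the output against an expected list, an order-insensitive comparison may be more robust. A minimal sketch, where the helper name `same_dicts` is hypothetical and all values are assumed hashable:

```python
from collections import Counter

def same_dicts(a, b):
    """Return True if two lists hold the same dicts, ignoring list order.

    Hypothetical helper for illustration; assumes all values are hashable.
    """
    def count(dicts):
        # A Counter of frozensets also respects how often each dict occurs
        return Counter(frozenset(d.items()) for d in dicts)
    return count(a) == count(b)

expected = [
    {'oranges': 'big', 'apples': 'green', 'bananas': 'fresh'},
    {'oranges': 'big', 'apples': 'red'},
    {'oranges': 'big', 'apples': 'green', 'bananas': 'rotten'},
]
shuffled = [expected[2], expected[0], expected[1]]
assert same_dicts(expected, shuffled)
```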