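I have a dictionary that maps each image class, such as dog and cat, to a list of image IDs. Some images contain both a dog and a cat, and I want to remove those images.
Say I have: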
{'cat':[1,2,3], 'dog':[2,3,4]}
We can see that the images with IDs 2 and 3 contain both a cat and a dog. I want to exclude those images to get the following:
[[1],[4]]
Here is what I have tried so far:
from collections import Counter
img_ids = {'cat':[1,2,3], 'dog':[2,3,4]}
flattened = [item for sublist in img_ids.values() for item in sublist]
flattened_unique = [k for k, v in dict(Counter(flattened)).items() if v < 2]
filtered_ids_dfs = []
for key, val in img_ids.items():
    filtered = [x for x in val if x in flattened_unique]
    filtered_ids_dfs.append(filtered)
print(filtered_ids_dfs)
Is there a better or more elegant solution? There may also be any number of classes, so the dictionary could have cat, dog, chicken, and so on.
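First, count how many classes (e.g. cat, dog) each image appears in. Then find the images that appear in exactly one class (the unique images). Finally, use a dictionary comprehension to keep only the images that are in the set of unique images.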
from collections import Counter
d = {'cat': [1, 2, 3], 'dog': [2, 3, 4], 'chicken': [2, 4, 5, 6]}
# Count how many times each image ID occurs across all classes.
c = Counter(item for items in d.values() for item in items)
# IDs that occur exactly once belong to a single class.
unique_images = {k for k, count in c.items() if count == 1}  # .iteritems() in Python 2
result = {k: [item for item in items if item in unique_images] for k, items in d.items()}  # .iteritems() in Python 2
print(result)  # {'cat': [1], 'dog': [], 'chicken': [5, 6]}
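If you want the list-of-lists shape from the question, the same counting idea can be wrapped in a small helper. This is only a sketch: the function name `drop_shared_ids` is made up for illustration, and it assumes the IDs are hashable and that the original order within each class should be kept.

```python
from collections import Counter

def drop_shared_ids(img_ids):
    """Return one list per class, keeping only IDs that occur in exactly one class."""
    # Count each ID once per class, even if a class list contains duplicates.
    counts = Counter(i for ids in img_ids.values() for i in set(ids))
    return [[i for i in ids if counts[i] == 1] for ids in img_ids.values()]

print(drop_shared_ids({'cat': [1, 2, 3], 'dog': [2, 3, 4]}))  # [[1], [4]]
```

Membership is checked against the `Counter` directly, so each lookup is a dictionary access rather than a scan of a list.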
Just use sets:
d = {'cat':[1,2,3], 'dog':[2,3,4]}
common = set(d['cat']) & set(d['dog'])
out = [list(set(d['cat']) - common), list(set(d['dog']) - common)]
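To extend this to more than two keys (note that set.intersection only removes IDs present in every class, not IDs shared by just some of them):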
common = set.intersection(*(set(v) for v in d.values()))
out = [list(set(v) - common) for v in d.values()]
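With three or more classes, the all-classes intersection leaves in IDs that are shared by only some of the classes. A set-based sketch that drops any ID appearing in at least two classes could look like this. The names `remove_shared`, `seen` and `shared` are just illustrative, and `sorted` is used only to make the output order deterministic, which assumes the IDs can be compared with each other.

```python
def remove_shared(d):
    """Drop every ID that appears in two or more of the lists in d."""
    seen, shared = set(), set()
    for ids in d.values():
        ids = set(ids)
        shared |= seen & ids  # IDs already seen in an earlier class
        seen |= ids
    return [sorted(set(ids) - shared) for ids in d.values()]

d = {'cat': [1, 2, 3], 'dog': [2, 3, 4], 'chicken': [2, 4, 5, 6]}
print(remove_shared(d))  # [[1], [], [5, 6]]
```

For the two-class example from the question this gives the same result as the plain intersection, `[[1], [4]]`.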