Count the total number of non-unique items in a list

Question

Tim*_*ces 2 python optimization list duplicates

I'm looking for the most time-efficient way to count the number of non-unique items in a large Python list (about 100,000 items).

My approach so far:

original_list = [1, 4, 6, 2, 2, 1, 5, 3, 2]

duplicates_list = []
for item in original_list:
    if original_list.count(item) > 1:
        duplicates_list.append(item)

duplicates_count = len(duplicates_list)

print(duplicates_count)

# Expected output: 5

Currently, a large list of about 70-80K items takes 1-2 minutes to process. I'd like to know whether we can cut the computation time as much as possible (ideally to 3-10 seconds).

I'd really appreciate any help!

Answer 1

Dav*_*uck 5

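A Counter object should be faster. Your version calls count() for every item in the list, so about 100,000 times according to your question, and each call scans the whole list. This version counts the whole list once and then iterates over the Counter, which visits each unique value only once.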

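from collections import Counter

original_list = [1, 4, 6, 2, 2, 1, 5, 3, 2]

# Tally every value in a single pass over the list.
count = Counter(original_list)

# Add up the tallies of the values that occur more than once.
dupes = sum(v for v in count.values() if v > 1)

print(dupes)  # 5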
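To check that the Counter approach gives the same result as the loop in the question, and to get a rough feel for the speed difference, something like the sketch below could be used. The function names `count_non_unique` and `count_non_unique_quadratic`, the sample size, and the value range are assumptions made for illustration. Timings will vary by machine, so none are quoted here.

```python
from collections import Counter
import random
import timeit


def count_non_unique(items):
    """Count the elements whose value occurs more than once (hypothetical helper)."""
    counts = Counter(items)  # single pass over the list
    return sum(n for n in counts.values() if n > 1)


def count_non_unique_quadratic(items):
    """Same result as the loop in the question: list.count() rescans the list per item."""
    return sum(1 for item in items if items.count(item) > 1)


# The example from the question.
print(count_non_unique([1, 4, 6, 2, 2, 1, 5, 3, 2]))  # 5

# A small random sample (size chosen arbitrarily) so the slow version finishes quickly.
random.seed(0)
sample = [random.randrange(1000) for _ in range(2000)]

# Both approaches should agree on the same input.
print(count_non_unique(sample) == count_non_unique_quadratic(sample))

# Rough timing comparison; the gap widens as the list grows.
fast = timeit.timeit(lambda: count_non_unique(sample), number=5)
slow = timeit.timeit(lambda: count_non_unique_quadratic(sample), number=5)
print(f"Counter: {fast:.4f}s, list.count loop: {slow:.4f}s")
```

The loop in the question does work proportional to the square of the list length, while the Counter version does work proportional to the length itself, which is why the difference becomes dramatic at 70-80K items.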

