Is using "not in" faster than using "in" in Python 3?

Question

Suppose we are solving a simple word-count problem. We have a list and want to count how many times each word occurs in it. Which pattern is faster here?

book_title = ['great', 'expectations', 'the', 'adventures', 'of', 'sherlock', 'holmes', 'the', 'great', 'gasby', 'hamlet', 'adventures', 'of', 'huckleberry', 'fin']
word_count_dict = {}

Pattern 1

for word in book_title:
    if word in word_count_dict:
        word_count_dict[word] += 1
    else:
        word_count_dict[word] = 1

Pattern 2

for word in book_title:
    if word not in word_count_dict:
        word_count_dict[word] = 1
    else:
        word_count_dict[word] += 1

Answer 1

Gre*_*Guy 4

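Six of one, half a dozen of the other. The two should be roughly equivalent. Computationally, the negation in not in is almost negligible (about the cheapest operation there is), and an in test on a hash table such as a dictionary runs in constant time on average (the hash is either there or it isn't). If we were dealing with a list instead, the test would run in linear time, but in and not in would still cost the same as each other. See also the computational complexity of Python data structures.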

So basically, use whichever one makes your code easier to understand.
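If you want to check this on your own machine rather than take it on faith, something like the following rough timeit sketch should do. The function names count_with_in, count_with_not_in and time_both are made up for this illustration, and the exact timings will vary with your hardware and Python version, so treat the numbers as indicative only.

```python
import timeit

book_title = ['great', 'expectations', 'the', 'adventures', 'of', 'sherlock',
              'holmes', 'the', 'great', 'gasby', 'hamlet', 'adventures', 'of',
              'huckleberry', 'fin']


def count_with_in(words):
    """Pattern 1: test with `in`, increment first."""
    counts = {}
    for word in words:
        if word in counts:
            counts[word] += 1
        else:
            counts[word] = 1
    return counts


def count_with_not_in(words):
    """Pattern 2: test with `not in`, initialise first."""
    counts = {}
    for word in words:
        if word not in counts:
            counts[word] = 1
        else:
            counts[word] += 1
    return counts


def time_both(words, number=10_000):
    """Return total seconds for `number` runs of each pattern (hypothetical helper)."""
    t_in = timeit.timeit(lambda: count_with_in(words), number=number)
    t_not_in = timeit.timeit(lambda: count_with_not_in(words), number=number)
    return t_in, t_not_in


t_in, t_not_in = time_both(book_title)
print(f"in:     {t_in:.4f} s")
print(f"not in: {t_not_in:.4f} s")
```

Both functions build the same dictionary, so any difference you see is down to the membership test and branch order, and it is likely to be lost in measurement noise.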

That said, have you considered using collections.Counter, a data structure designed specifically for this purpose?

import collections
book_title = ['great', 'expectations','the', 'adventures', 'of', 'sherlock','holmes','the','great','gasby','hamlet','adventures','of','huckleberry','fin']
word_counts = collections.Counter(book_title)
print(word_counts)
# Counter({'great': 2, 'the': 2, 'adventures': 2, 'of': 2, 'expectations': 1, 'sherlock': 1, 'holmes': 1, 'gasby': 1, 'hamlet': 1, 'huckleberry': 1, 'fin': 1})

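You can convert a collections.Counter to a dict if you need to; in fact, collections.Counter is a subclass of dict. It even has an .update() method designed specifically to work with other counters: if you add another book title, just feed it into a Counter and then .update() the original with it.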
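As a minimal sketch of that last point: the two short word lists below are made up for illustration and are not taken from the question.

```python
from collections import Counter

# Hypothetical word lists, used only for this example.
first_title = ['great', 'expectations']
second_title = ['great', 'gatsby']

word_counts = Counter(first_title)

# Counter.update() adds counts together, unlike dict.update(),
# which would overwrite existing values.
word_counts.update(Counter(second_title))

# A Counter already is a dict, but it can be converted to a plain one.
plain_counts = dict(word_counts)

print(isinstance(word_counts, dict))
print(plain_counts)
```

Counter.update() also accepts a plain iterable of words, so word_counts.update(second_title) would give the same result here.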
