use*_*211 2 python list frequency python-2.7
I'm new to Python and am trying to sort a list and get the 3 most frequent words. This is what I have so far:
from collections import Counter
reader = open("longtext.txt",'r')
data = reader.read()
reader.close()
words = data.split() # Into a list
uniqe = sorted(set(words)) # Remove duplicate words and sort
for word in uniqe:
    print '%s: %s' % (word, words.count(word))  # words.count counts the words.
This is my output. How can I sort by frequency and list only the first, second, and third most frequent words?
2: 2
3.: 1
3?: 1
New: 1
Python: 5
Read: 1
and: 1
between: 1
choosing: 1
or: 2
to: 1
You can use the most_common method of collections.Counter, like this:
from collections import Counter
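with open("longtext.txt", "r") as reader:
    # Count individual words rather than whole lines; split() breaks each line on whitespace.
    c = Counter(word for line in reader for word in line.split())
print c.most_common(3)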
Quoting the example from the official documentation:
>>> Counter('abracadabra').most_common(3)
[('a', 5), ('r', 2), ('b', 2)]
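One detail in the quoted output: 'r' and 'b' are tied at 2, and the order of tied elements depends on the Python version. Since Python 3.7, elements with equal counts are returned in the order they were first encountered, so a recent interpreter lists ('b', 2) before ('r', 2). The following is a small sketch (the variable names are chosen only for illustration) that checks the result without relying on tie order:

```python
from collections import Counter

top = Counter('abracadabra').most_common(3)

# The clear winner always comes first.
first = top[0]
# 'b' and 'r' both occur twice; compare them as a set so the check
# does not depend on how the interpreter orders ties.
tied = set(top[1:])

print(first)
print(tied == {('b', 2), ('r', 2)})
```

If the exact order of tied words matters for your output, it is safer to break ties explicitly than to depend on the interpreter's behavior.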
If you want to print them the way the question shows, you can simply iterate over the most common elements and print them like this:
for word, count in c.most_common(3):
    print "{}: {}".format(word, count)
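To try the approach without a file on disk, the same steps can be sketched on an in-memory string. The function name top_words and the sample text are made up for illustration and do not come from the question. The sketch calls print(...) with a single argument so that it should run on both Python 2.7 and Python 3:

```python
from collections import Counter


def top_words(text, n=3):
    """Return the n most frequent whitespace-separated words in text."""
    # str.split() with no arguments splits on any run of whitespace,
    # including newlines, so multi-line text is handled too.
    return Counter(text.split()).most_common(n)


# Hypothetical sample text, standing in for the contents of longtext.txt.
sample = "Python or Python to Python or"

for word, count in top_words(sample):
    print("{}: {}".format(word, count))
```

With this sample the loop prints Python: 3, or: 2, and to: 1, one per line. Note that split() leaves punctuation attached, which is why "3." and "3?" show up as separate words in the question's output.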
Note: The Counter approach is better than the sorting approach because Counter runs in O(N) time, whereas sorting takes O(N*log N) in the worst case.
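To make the comparison concrete, here is a rough sketch that puts a variant of the question's list.count loop next to the Counter version. Both function names and the sample data are hypothetical, and the complexity remarks in the comments assume U unique words in a list of N words:

```python
from collections import Counter


def top_n_with_list_count(words, n=3):
    """Close to the question's approach: count each unique word, then sort."""
    # list.count scans the whole list on every call, so building the
    # pairs costs O(U * N) before the O(U log U) sort.
    counts = [(word, words.count(word)) for word in set(words)]
    counts.sort(key=lambda pair: pair[1], reverse=True)
    return counts[:n]


def top_n_with_counter(words, n=3):
    """Counter tallies every word in a single O(N) pass over the list."""
    return Counter(words).most_common(n)


# Hypothetical data with distinct counts, so tie order does not matter.
words = ["a"] * 4 + ["b"] * 3 + ["c"] * 2 + ["d"]

print(top_n_with_list_count(words) == top_n_with_counter(words))
```

Both functions return the same top three here; the difference is only in how much work is done to get there, which becomes noticeable on long texts with many distinct words.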