Sorting words by frequency in Python

Tos*_*osh 0 python sorting list python-3.x

def candidateWord():
    filePath = "sample.txt"
    word_count = {}
    # Read from the file itself (the original opened it but then read sys.stdin)
    with open(filePath, 'r') as file:
        for line in file:
            for word in line.split():
                # Strip surrounding punctuation and digits, then lowercase
                words = word.strip('!,.?1234567890-=@#$%^&*()_+').lower()
                word_count[words] = word_count.get(words, 0) + 1

    # Print the counts once, after the whole file has been read (still unsorted)
    for key in word_count.keys():
        print(str(key) + ' ' + str(word_count[key]))

candidateWord()

How can I sort the words in the text file by frequency, using the counts I already have?

The text file (sample.txt) contains the following: How are you How are you I am good. HBHJKOLDSA How

My desired output is:

how 3
am 2
are 2
i 2
you 2
good 1
hbhjkoldsa 1

I am working in Python 3.

Pav*_*sov 5

Use collections.Counter:

from collections import Counter

with open("sample.txt", 'r') as f:
    text = f.read()

words = [w.strip('!,.?1234567890-=@#$%^&*()_+') for w in text.lower().split()]

counter = Counter(words)

print(counter.most_common())
# On Python 3.7+, words with equal counts keep their first-seen order:
# [('how', 3), ('are', 2), ('you', 2), ('i', 1), ('am', 1), ('good', 1), ('hbhjkoldsa', 1)]

To print it in your desired format:

print("\n".join("{} {}".format(*p) for p in counter.most_common()))

 

Using your code and sorting by (frequency descending, word ascending):

for key, value in sorted(word_count.items(), key=lambda p: (-p[1], p[0])):
    print("{} {}".format(key, value))

 

The Counter result can be sorted the same way: just replace word_count.items() with counter.most_common().
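
Putting the two approaches together, here is a minimal sketch that counts with Counter and then applies the (frequency descending, word ascending) sort key. The function name words_by_frequency and the PUNCTUATION constant are made up for illustration, and skipping tokens that strip down to an empty string is an added assumption that the original code does not make. The sample text is passed in as a string rather than read from sample.txt so the snippet runs on its own.

```python
from collections import Counter

# Same characters the question strips from each word (hypothetical constant name)
PUNCTUATION = '!,.?1234567890-=@#$%^&*()_+'


def words_by_frequency(text):
    """Return (word, count) pairs, highest count first, ties in alphabetical order."""
    words = (w.strip(PUNCTUATION) for w in text.lower().split())
    # Assumption: drop tokens that consisted only of punctuation or digits
    counter = Counter(w for w in words if w)
    # Negating the count sorts it descending while the word stays ascending
    return sorted(counter.most_common(), key=lambda p: (-p[1], p[0]))


sample = "How are you How are you I am good. HBHJKOLDSA How"
for word, count in words_by_frequency(sample):
    print("{} {}".format(word, count))
# how 3
# are 2
# you 2
# am 1
# good 1
# hbhjkoldsa 1
# i 1
```

Because the sort key is a tuple, Python compares the negated count first and only falls back to the word itself when two counts are equal, which gives a stable, reproducible order regardless of the Python version.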