Word frequency count in Python not working

Question

I am trying to count word frequencies in a text file using Python.

I am using the following code:

openfile=open("total data", "r")

linecount=0
for line in openfile:
    if line.strip():
        linecount+=1

count={}

while linecount>0:
    line=openfile.readline().split()
    for word in line:
        if word in count:
            count[word]+=1
        else:
            count[word]=1
    linecount-=1

print count


But I get an empty dictionary: print count gives {} as output.

I also tried using:

from collections import defaultdict
.
.
count=defaultdict(int)
.
.
     if word in count:
          count[word]=count.get(word,0)+1


But again I get an empty dictionary. I don't understand what I am doing wrong. Can someone point it out?

Answer 1

Ash*_*ary 9

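The loop for line in openfile: moves the file pointer to the end of the file, so the later readline() calls return empty strings and nothing is counted. If you want to read the data again, either move the pointer back to the start of the file with openfile.seek(0) or reopen the file.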
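
A minimal sketch of the seek(0) fix, applied to the two-pass approach from the question. The sample text and the temporary file are assumptions made only so the snippet runs on its own; with real data you would open "total data" instead.

```python
import os
import tempfile

# Hypothetical sample data standing in for the "total data" file.
sample = "a b a\n\nb c\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

count = {}
with open(path, "r") as openfile:
    # First pass: count non-blank lines. This leaves the pointer at end of file.
    linecount = sum(1 for line in openfile if line.strip())

    # Rewind so the second pass starts from the first line again.
    openfile.seek(0)

    # Second pass: tally each whitespace-separated word.
    for line in openfile:
        for word in line.split():
            count[word] = count.get(word, 0) + 1

os.remove(path)  # clean up the temporary sample file
print(linecount, count)
```

The second pass here iterates over the file directly instead of calling readline() linecount times. Because linecount skips blank lines, a readline() loop bounded by it would stop early on any file that contains blank lines.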

For word frequencies it is better to use collections.Counter:

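from collections import Counter

with open("total data", "r") as openfile:
    c = Counter()
    for line in openfile:
        # split() with no arguments splits on any run of whitespace
        words = line.split()
        # update() adds one to the count of every word in the list
        c.update(words)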

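
As a rough illustration of how the resulting Counter can be queried, here is a small sketch. The in-memory list of lines is a hypothetical stand-in for the file contents, used only so the example is self-contained.

```python
from collections import Counter

# Hypothetical sample lines standing in for the contents of "total data".
lines = ["the cat sat", "the cat ran", "", "the end"]

c = Counter()
for line in lines:
    c.update(line.split())  # blank lines split to [] and add nothing

print(c["the"])          # count for a single word
print(c["missing"])      # words never seen report 0 instead of raising KeyError
print(c.most_common(2))  # the two most frequent words with their counts
```

Counter is a dict subclass, so it can be used wherever the plain dictionary from the question would be, without the explicit "if word in count" check.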
