How to find the most repeated word in a text file

Question

Code:

import java.io.File;
import java.util.Scanner; 

class Main {
    public static void main(String[] args) throws Exception{
        //code
        int max = 0;
        int count = 0;
        String rep_word = "none";
        File myfile = new File("rough.txt");
        Scanner reader = new Scanner(myfile);
        Scanner sub_reader = new Scanner(myfile);
        while (reader.hasNextLine()) {
            String each_word = reader.next();
            while (sub_reader.hasNextLine()){
                    String check = sub_reader.next();
                    if (check == each_word){
                        count+=1;
                    }
            }
            if (max<count){
                max = count;
                rep_word = each_word;
            }
          }
        System.out.println(rep_word);  
        reader.close();
        sub_reader.close();
        
    }
}

The rough.txt file:

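I want to return the most repeated word from a text file without using arrays, but I am not getting the desired output. I found that the if condition is not satisfied even when the variables "check" and "each_word" hold the same word, and I don't understand where I went wrong.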

Answer 1

Ale*_*nko 5

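You should use a map such as HashMap to count the frequency of each word quickly and efficiently, instead of re-reading the input file over and over with two readers.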

The Map::merge method does this counting. It also returns the word's updated frequency, so the maximum can be tracked in the same pass.

import java.io.File;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Scanner;

class Main {
    public static void main(String[] args) throws Exception {
        int max = 0;
        String rep_word = "none";

        // use LinkedHashMap to maintain insertion order
        Map<String, Integer> freqMap = new LinkedHashMap<>();

        // use try-with-resources to automatically close scanner
        try (Scanner reader = new Scanner(new File("rough.txt"))) {
            while (reader.hasNext()) {
                String word = reader.next();
                // merge adds 1 to the word's count and returns the new count
                int count = freqMap.merge(word, 1, Integer::sum);
                if (count > max) {
                    max = count;
                    rep_word = word;
                }
            }
        }
        System.out.println(rep_word + " repeated " + max + " times");
    }
}
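To make the return value of merge concrete, here is a minimal sketch that is separate from the file-reading code. The class name MergeDemo, the helper countWord, and the sample words are made up for illustration and are not part of the original answer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MergeDemo {
    // Hypothetical helper: records one occurrence of word and
    // returns that word's updated frequency.
    static int countWord(Map<String, Integer> freqMap, String word) {
        // If the key is absent, merge stores 1. Otherwise it applies
        // Integer::sum to the old value and 1. Either way it returns
        // the value now associated with the key.
        return freqMap.merge(word, 1, Integer::sum);
    }

    public static void main(String[] args) {
        Map<String, Integer> freqMap = new LinkedHashMap<>();
        System.out.println(countWord(freqMap, "apple")); // prints 1
        System.out.println(countWord(freqMap, "pear"));  // prints 1
        System.out.println(countWord(freqMap, "apple")); // prints 2
        System.out.println(freqMap);                     // prints {apple=2, pear=1}
    }
}
```

Because the updated count comes back from the same call that stores it, no separate get is needed before comparing against max.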

If several words share the same maximum frequency, all of them are easy to find in the map:

for (Map.Entry<String, Integer> entry : freqMap.entrySet()) {
    if (max == entry.getValue()) {
        System.out.println(entry.getKey() + " repeated " + max + " times");  
    }
}
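As for the if condition in the question: check == each_word compares object references, not characters, and two strings read separately by a Scanner are normally distinct objects even when their text matches. Comparing with equals checks the content. The question's loops would also need count reset to 0 and sub_reader reopened for every word, since a Scanner that has reached the end of the file yields nothing more. The sketch below only illustrates the comparison issue; the class name StringCompareDemo and the helper countOccurrences are hypothetical, and it reads from an in-memory string rather than rough.txt.

```java
import java.util.Scanner;

class StringCompareDemo {
    // Hypothetical helper: counts the whitespace-separated tokens in
    // text whose content equals target.
    static int countOccurrences(String text, String target) {
        int count = 0;
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNext()) {
                // equals compares characters; == would compare references
                if (sc.next().equals(target)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String a = new String("word");
        String b = new String("word");
        System.out.println(a == b);      // prints false: two distinct objects
        System.out.println(a.equals(b)); // prints true: same characters
        System.out.println(countOccurrences("a b a c a", "a")); // prints 3
    }
}
```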
