Bre*_*dan 10 java string hashtable binary-search
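I am trying to implement a program that takes user input, splits that string into tokens, and then searches a dictionary for the words in that string. My goal for the parsed string is for every token to be an English word.
For example: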
    Input:
    aman
    Split Method:
    a man
    a m an
    a m a n
    am an
    am a n
    ama n
    Desired Output:
    a man
I currently have this code, which does everything up to the desired-output part:
    import java.util.Scanner;
    import java.io.*;

    public class Words {

        public static String[] dic = new String[80368];

        public static void split(String head, String in) {
            // head + " " + in is a segmentation
            String segment = head + " " + in;

            // count number of dictionary words
            int count = 0;
            Scanner phraseScan = new Scanner(segment);
            while (phraseScan.hasNext()) {
                String word = phraseScan.next();
                for (int i=0; i<dic.length; i++) {
                    if (word.equalsIgnoreCase(dic[i])) count++;
                }
            }

            System.out.println(segment + "\t" + count + " English words");

            // recursive calls
            for (int i=1; i<in.length(); i++) {
                split(head+" "+in.substring(0,i), in.substring(i,in.length()));
            }
        }

        public static void main (String[] args) throws IOException {
            Scanner scan = new Scanner(System.in);
            System.out.print("Enter a string: ");
            String input = scan.next();
            System.out.println();

            Scanner filescan = new Scanner(new File("src:\\dictionary.txt"));
            int wc = 0;
            while (filescan.hasNext()) {
                dic[wc] = filescan.nextLine();
                wc++;
            }
            System.out.println(wc + " words stored");

            split("", input);
        }
    }
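I know there are better ways to store the dictionary (such as a binary search tree or a hash table), but I don't know how to implement them.
I am stuck on how to implement a method that checks a split string to see whether every segment is a word in the dictionary.
Any help would be great, thank you.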
Whi*_*g34 16
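Splitting the input string in every possible way will not finish in a reasonable amount of time if you want to support 20 or more characters, because a string of n characters has 2^(n-1) possible splits. Here is a more efficient approach, with comments inline: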
    import java.io.File;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Scanner;
    import java.util.Set;
    import java.util.Stack;

    public class Words {

        public static void main(String[] args) throws IOException {
            // load the dictionary into a set for fast lookups
            Set<String> dictionary = new HashSet<String>();
            Scanner filescan = new Scanner(new File("dictionary.txt"));
            while (filescan.hasNext()) {
                dictionary.add(filescan.nextLine().toLowerCase());
            }

            // scan for input
            Scanner scan = new Scanner(System.in);
            System.out.print("Enter a string: ");
            String input = scan.next().toLowerCase();
            System.out.println();

            // place to store list of results, each result is a list of strings
            List<List<String>> results = new ArrayList<>();

            long time = System.currentTimeMillis();

            // start the search, pass empty stack to represent words found so far
            search(input, dictionary, new Stack<String>(), results);

            time = System.currentTimeMillis() - time;

            // list the results found
            for (List<String> result : results) {
                for (String word : result) {
                    System.out.print(word + " ");
                }
                System.out.println("(" + result.size() + " words)");
            }
            System.out.println();
            System.out.println("Took " + time + "ms");
        }

        public static void search(String input, Set<String> dictionary,
                Stack<String> words, List<List<String>> results) {

            for (int i = 0; i < input.length(); i++) {
                // take the first i + 1 characters of the input and see if it is a word
                String substring = input.substring(0, i + 1);

                if (dictionary.contains(substring)) {
                    // the beginning of the input matches a word, store on stack
                    words.push(substring);

                    if (i == input.length() - 1) {
                        // there's no input left, copy the words stack to results
                        results.add(new ArrayList<String>(words));
                    } else {
                        // there's more input left, search the remaining part
                        search(input.substring(i + 1), dictionary, words, results);
                    }

                    // pop the matched word back off so we can move onto the next i
                    words.pop();
                }
            }
        }
    }
Sample output:
    Enter a string: aman
    a man (2 words)
    am an (2 words)
    Took 0ms
Here is a longer input:
    Enter a string: thequickbrownfoxjumpedoverthelazydog
    the quick brown fox jump ed over the lazy dog (10 words)
    the quick brown fox jump ed overt he lazy dog (10 words)
    the quick brown fox jumped over the lazy dog (9 words)
    the quick brown fox jumped overt he lazy dog (9 words)
    Took 1ms
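The search above lists every valid segmentation, and the number of those can itself grow quickly for some inputs and dictionaries. If a single answer is enough, one possible variation is a small dynamic-programming pass that keeps only the best split for each prefix. The sketch below is an illustration rather than part of the code above: the class name `WordBreakSketch`, the method name `fewestWords`, and the "fewest words wins" rule are all assumptions, and the tiny in-memory dictionary stands in for the real `dictionary.txt`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class WordBreakSketch {

    // Hypothetical helper: returns one segmentation of input that uses the
    // fewest dictionary words, or an empty list if none exists.
    public static List<String> fewestWords(String input, Set<String> dictionary) {
        int n = input.length();
        // best[i] = fewest words covering the first i characters, -1 if impossible
        int[] best = new int[n + 1];
        // cut[i] = start index of the last word in that best cover
        int[] cut = new int[n + 1];
        Arrays.fill(best, -1);
        best[0] = 0; // the empty prefix needs zero words

        for (int end = 1; end <= n; end++) {
            for (int start = 0; start < end; start++) {
                if (best[start] < 0) continue; // prefix before start cannot be split
                if (!dictionary.contains(input.substring(start, end))) continue;
                // strict '<' keeps the first candidate found on ties
                if (best[end] < 0 || best[start] + 1 < best[end]) {
                    best[end] = best[start] + 1;
                    cut[end] = start;
                }
            }
        }

        List<String> words = new ArrayList<>();
        if (best[n] <= 0) return words; // no segmentation, or empty input
        // walk the cut points backwards to rebuild the words in order
        for (int end = n; end > 0; end = cut[end]) {
            words.add(0, input.substring(cut[end], end));
        }
        return words;
    }

    public static void main(String[] args) {
        // assumed toy dictionary, only for demonstration
        Set<String> dictionary = new HashSet<>(Arrays.asList("a", "am", "an", "man"));
        System.out.println(fewestWords("aman", dictionary)); // prints [a, man]
    }
}
```

This does a number of substring lookups that is quadratic in the input length, regardless of how many segmentations exist. The trade-off is that it reports only one split; when several splits tie on word count, as "a man" and "am an" do, which one comes back depends on the loop order, so a different preference would need a different comparison.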
Run Code Online (Sandbox Code Playgroud)