I want to remove stop words from text in Java.
To do that, I read the stop words from a text file
and store them in a Set:
Set<String> stopWords = new LinkedHashSet<String>();
BufferedReader br = new BufferedReader(new FileReader("stopwords.txt"));
String words = null;
// Read one stop word per line and store it without surrounding whitespace.
while ((words = br.readLine()) != null) {
    stopWords.add(words.trim());
}
br.close();
I also read a second text file.
I want to remove from its contents every string that also appears in the stop-word set.
How can I do that?
Use a Set for the stop words:
Set<String> stopWords = new LinkedHashSet<String>();
BufferedReader SW = new BufferedReader(new FileReader("StopWord.txt"));
// Read one stop word per line and store it without surrounding whitespace.
for (String line; (line = SW.readLine()) != null; ) {
    stopWords.add(line.trim());
}
SW.close();
and an ArrayList for the words of the input file (txt_file.txt):
BufferedReader br = new BufferedReader(new FileReader("txt_file.txt"));
// Build your ArrayList of words from the input file here.
// deletStopWord() removes every word that appears in the stop-word set.
public ArrayList<String> deletStopWord(Set<String> stopWords, ArrayList<String> arraylist) {
    ArrayList<String> newList = new ArrayList<String>();
    // Keep only the words that are not in the stop-word set.
    for (String word : arraylist) {
        if (!stopWords.contains(word)) {
            newList.add(word);
        }
    }
    System.out.println(newList);
    return newList;
}
arraylist = deletStopWord(stopWords, arraylist);
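The snippets above leave out how the input text becomes a list of words, so here is a minimal end-to-end sketch of one way it could look. The class name StopWordFilter and the methods loadStopWords and removeStopWords are names I made up for illustration. The sketch assumes that the stop-word file has one word per line, that words in the input are separated by whitespace, and that matching should ignore case; none of that is stated in the question, so adjust as needed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper class; the names are illustrative, not from the question.
class StopWordFilter {

    // Reads one stop word per line (assumed file layout), trimmed and
    // lower-cased. Blank lines are skipped.
    static Set<String> loadStopWords(Path file) throws IOException {
        Set<String> stopWords = new HashSet<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim().toLowerCase(Locale.ROOT);
                if (!word.isEmpty()) {
                    stopWords.add(word);
                }
            }
        }
        return stopWords;
    }

    // Splits the text on whitespace (an assumption about the input) and keeps
    // every token whose lower-cased form is not in the stop-word set.
    static List<String> removeStopWords(String text, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty() && !stopWords.contains(token.toLowerCase(Locale.ROOT))) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) throws IOException {
        // File names taken from the question and answer above.
        Set<String> stopWords = loadStopWords(Paths.get("stopwords.txt"));
        String text = new String(
                Files.readAllBytes(Paths.get("txt_file.txt")), StandardCharsets.UTF_8);
        System.out.println(String.join(" ", removeStopWords(text, stopWords)));
    }
}
```

Note that punctuation stays attached to tokens here, so "the," would not match the stop word "the". If that matters for your data, strip punctuation from each token before the lookup.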