Sha*_*kar 2 java lucene indexing
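I am trying to reuse Document and Field instances to improve performance (I tested on a file with 1 million lines; without reusing instances, indexing takes 20 seconds).
But when I try to reuse them, it takes far too long and just keeps running.
Has anyone run into the same problem before?
Here is the existing code from before I tried to reuse instances; it creates a new Document and new Fields for every line in the file.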
    FileInputStream fis;
    try {
        fis = new FileInputStream(file);
        String filePath = file.getPath();
        BufferedReader br = new BufferedReader(
                new InputStreamReader(fis, StandardCharsets.UTF_8));
        String line = null;
        while ((line = br.readLine()) != null) {
            String[] lineTokens = line.split("\\|");
            Document doc = new Document();
            Field field1 = new TextField("field1", field1Value, Field.Store.YES);
            doc.add(field1);
            Field field2 = new StringField("field2", field2Value, Field.Store.YES);
            doc.add(field2);
            writer.addDocument(doc);
        }
        br.close();
    } catch (FileNotFoundException fnfe) {
    }
After the change:
    FileInputStream fis;
    try {
        fis = new FileInputStream(file);
        String filePath = file.getPath();
        BufferedReader br = new BufferedReader(
                new InputStreamReader(fis, StandardCharsets.UTF_8));
        String line = null;
        Document doc = new Document();
        Field field1 = new TextField("field1", field1Value, Field.Store.YES);
        Field field2 = new StringField("field2", field2Value, Field.Store.YES);
        while ((line = br.readLine()) != null) {
            //String[] lineTokens = line.split("\\|");
            field1.setStringValue("field1Value");
            doc.add(field1);
            field2.setStringValue("field2Value");
            doc.add(field2);
            writer.addDocument(doc);
        }
        br.close();
    } catch (FileNotFoundException fnfe) {
    }
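You don't need to add the fields to the doc on every iteration. Each call to doc.add appends another field to the same Document, so in your changed code the document grows by two fields per line and every addDocument call has more to index. Once a field has been added, all you need to do is change its value and then write the changed document to the index, like this: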
    // Build the document and add each field exactly once, before the loop.
    Document doc = new Document();
    Field field1 = new TextField("field1", field1Value, Field.Store.YES);
    doc.add(field1);
    Field field2 = new StringField("field2", field2Value, Field.Store.YES);
    doc.add(field2);
    while ((line = br.readLine()) != null) {
        // Only the values change; the same doc and field instances are reused.
        field1.setStringValue("field1Value");
        field2.setStringValue("field2Value");
        writer.addDocument(doc);
    }
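
To see why adding the fields inside the loop hurts, here is a small sketch that uses only the standard library. `FakeField` and `FakeDocument` are hypothetical stand-ins, not Lucene classes; they assume only that a document keeps its fields in a list and that `add` appends to it. The sketch counts how many fields the reused document ends up holding under each pattern.

```java
import java.util.ArrayList;
import java.util.List;

class ReuseDemo {
    // Hypothetical stand-in for a field: a name plus a mutable value.
    static final class FakeField {
        final String name;
        String value;
        FakeField(String name, String value) { this.name = name; this.value = value; }
        void setStringValue(String value) { this.value = value; }
    }

    // Hypothetical stand-in for a document: add() simply appends to a list.
    static final class FakeDocument {
        final List<FakeField> fields = new ArrayList<>();
        void add(FakeField field) { fields.add(field); }
    }

    // Pattern from the question: the reused fields are re-added on every line.
    static int fieldCountAddingInLoop(int lines) {
        FakeDocument doc = new FakeDocument();
        FakeField field1 = new FakeField("field1", "");
        FakeField field2 = new FakeField("field2", "");
        for (int i = 0; i < lines; i++) {
            field1.setStringValue("a" + i);
            doc.add(field1);
            field2.setStringValue("b" + i);
            doc.add(field2);
            // the real code would call writer.addDocument(doc) here
        }
        return doc.fields.size();
    }

    // Pattern from the answer: add once, then only update the values.
    static int fieldCountAddingOnce(int lines) {
        FakeDocument doc = new FakeDocument();
        FakeField field1 = new FakeField("field1", "");
        FakeField field2 = new FakeField("field2", "");
        doc.add(field1);
        doc.add(field2);
        for (int i = 0; i < lines; i++) {
            field1.setStringValue("a" + i);
            field2.setStringValue("b" + i);
            // the real code would call writer.addDocument(doc) here
        }
        return doc.fields.size();
    }

    public static void main(String[] args) {
        System.out.println(fieldCountAddingInLoop(1000)); // prints 2000
        System.out.println(fieldCountAddingOnce(1000));   // prints 2
    }
}
```

With 1 million lines, the first pattern would leave the document holding 2 million field entries by the last line, and each write would have to process all of them. That matches the run that never seems to finish.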