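Lucene 4.3 newbie
How do I run a simple search in Lucene 4.3?
I adapted the outline from the overview page into a simple Java test case: http://lucene.apache.org/core/4_3_0/core/overview-summary.html#overview_description
That example opens the searcher with: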
DirectoryReader ireader = DirectoryReader.open(directory);
IndexSearcher isearcher = new IndexSearcher(ireader);
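But according to the documentation, DirectoryReader is not visible (protected), so it looks as though you cannot use DirectoryReader.
So I did some digging and tried various permutations that avoid using DirectoryReader directly, including: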
File indexdir = new File("D:\\lucenetest\\") ; // location of my index
Directory directory = FSDirectory.open(indexdir);
IndexReader ireader = IndexReader.open(FSDirectory.open(indexdir)); //ERROR NoSuchMethodError
//IndexReader ireader = IndexReader.open(directory); //variation ERROR NoSuchMethodError
IndexSearcher isearcher = new IndexSearcher(ireader);
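And so on (including trying AtomicReader). Nothing seems to work. (I have confirmed that Lucene Core is imported correctly.) Indexing works fine.
I looked at the Lucene demo search code for more clues: http://lucene.apache.org/core/4_2_1/demo/src-html/org/apache/lucene/demo/SearchFiles.html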
IndexReader reader = DirectoryReader.open(FSDirectory.open(new File(index))); //DirectoryReader not visible error
IndexSearcher searcher = new IndexSearcher(reader);
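That does not work either when used in my simple example file.
I have been able to get simple indexing to work, and previously got the Lucene demo working (both indexing and searching). However, I cannot seem to get a simple search working.
Any clues?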
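You can run a very simple search with this sample code:
import java.io.File;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// directory where your index is stored
File path = new File(" ... /solr/solr/Collection1/data/index");
Directory index = FSDirectory.open(path);
IndexReader reader = DirectoryReader.open(index);
IndexSearcher searcher = new IndexSearcher(reader);
Term t = new Term("myfield", "myvalue");
// Get the top 10 docs that contain the exact term
Query query = new TermQuery(t);
TopDocs tops = searcher.search(query, 10);
ScoreDoc[] scoreDoc = tops.scoreDocs;
System.out.println(scoreDoc.length);
for (ScoreDoc score : scoreDoc) {
    System.out.println("DOC " + score.doc + " SCORE " + score.score);
}
// Get the number of documents that contain the term
int freq = reader.docFreq(t);
System.out.println("FREQ " + freq);
// Release the index files once the search is done
reader.close();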
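If `DirectoryReader` still shows up as not visible, or `IndexReader.open` throws `NoSuchMethodError` at runtime, one thing worth ruling out is an older Lucene jar shadowing the 4.3 one on the classpath. This is only a guess at the cause, not a confirmed diagnosis. The sketch below uses nothing but the JDK; the class `ClassOrigin` and its method names are made up for illustration. It prints where a class was loaded from and whether it is public and exposes a public static method with the given name.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.security.CodeSource;

// Hypothetical helper (not part of Lucene): reports which jar or directory a class came from.
class ClassOrigin {

    /** Where the class was loaded from, or a placeholder when the JVM does not say. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    /** True if the class itself declares a public static method with this name. */
    static boolean hasPublicStaticMethod(Class<?> type, String methodName) {
        for (Method m : type.getDeclaredMethods()) {
            int mods = m.getModifiers();
            if (m.getName().equals(methodName) && Modifier.isPublic(mods) && Modifier.isStatic(mods)) {
                return true;
            }
        }
        return false;
    }

    /** One-line summary for a class name; never throws if the class cannot be loaded. */
    static String describe(String className, String methodName) {
        Class<?> type;
        try {
            // initialize=false: only load the class, do not run its static initializers
            type = Class.forName(className, false, ClassOrigin.class.getClassLoader());
        } catch (ClassNotFoundException | LinkageError e) {
            return className + ": could not be loaded (" + e + ")";
        }
        return className
                + ": loaded from " + locationOf(type)
                + ", public class=" + Modifier.isPublic(type.getModifiers())
                + ", public static " + methodName + "()=" + hasPublicStaticMethod(type, methodName);
    }

    public static void main(String[] args) {
        System.out.println(describe("org.apache.lucene.index.DirectoryReader", "open"));
        System.out.println(describe("org.apache.lucene.index.IndexReader", "open"));
    }
}
```

Run it with the same classpath as the failing test. If the reported location is not the 4.3 core jar, or the class is reported as non-public, the compile-time and runtime versions of Lucene probably differ.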
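I usually use this code: a class that wraps all the operations on a Lucene (v4) index.
It uses near-real-time (NRT) access to the index, so almost every update is available to the index readers.
Note: it also uses Lombok.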
import java.io.IOException;
import java.util.Set;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.NRTManager;
import org.apache.lucene.search.NRTManager.TrackingIndexWriter;
import org.apache.lucene.search.NRTManagerReopenThread;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.SearcherFactory;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;

import lombok.extern.slf4j.Slf4j;

// LuceneConstants, LucenePageResults and CollectionUtils are not Lucene classes;
// they are presumably the author's own helpers and are not shown here.
@Slf4j
public class LuceneIndex {
/////////////////////////////////////////////////////////////////////////////////////////
//  STATUS (see http://blog.mikemccandless.com/2011/11/near-real-time-readers-with-lucenes.html)
/////////////////////////////////////////////////////////////////////////////////////////
    private final IndexWriter _indexWriter;
    private final TrackingIndexWriter _trackingIndexWriter;
    private final NRTManager _searchManager;

    LuceneNRTReopenThread _reopenThread = null;
    private long _reopenToken;  // token returned by the index update/delete methods

/////////////////////////////////////////////////////////////////////////////////////////
//  CONSTRUCTOR
/////////////////////////////////////////////////////////////////////////////////////////
    /**
     * Constructor based on an instance of the type responsible for persisting the lucene index
     */
    public LuceneIndex(final Directory luceneDirectory,
                       final Analyzer analyzer) {
        try {
            // Create the indexWriter
            _indexWriter = new IndexWriter(luceneDirectory,
                                           new IndexWriterConfig(LuceneConstants.VERSION,
                                                                 analyzer));
            _trackingIndexWriter = new NRTManager.TrackingIndexWriter(_indexWriter);
            // Create the SearchManager used to run the searches
            _searchManager = new NRTManager(_trackingIndexWriter,
                                            new SearcherFactory(),
                                            true);

            // Start the thread in charge of re-opening the index so that it sees real-time changes
            // The index is refreshed every 60 seconds when nobody is waiting
            // and every 100 milliseconds whenever someone is waiting (see the search method)
            // (see http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/search/NRTManagerReopenThread.html)
            _reopenThread = new LuceneNRTReopenThread(_searchManager,
                                                      60.0,   // when there is nobody waiting
                                                      0.1);   // when there is someone waiting
            _reopenThread.startReopening();

        } catch (IOException ioEx) {
//            if (luceneDirectory instanceof JdbcDirectory) {
//                throw new IllegalStateException("The DB table for the lucene index could not be created: " + ioEx.getMessage(),ioEx);
//            } else {
            throw new IllegalStateException("Lucene index could not be created: " + ioEx.getMessage());
//            }
        }
    }
/////////////////////////////////////////////////////////////////////////////////////////
//  FINALIZER
/////////////////////////////////////////////////////////////////////////////////////////
    @Override
    protected void finalize() throws Throwable {
        this.close();
        super.finalize();
    }
    /**
     * Closes every index
     */
    public void close() {
        try {
            // stop the index reader re-open thread
            _reopenThread.stopReopening();
            _reopenThread.interrupt();

            // Close the search manager
            _searchManager.close();

            // Close the indexWriter, committing everything that's pending
            _indexWriter.commit();
            _indexWriter.close();

        } catch (IOException ioEx) {
            log.error("Error while closing lucene index: {}",ioEx.getMessage(),
                                                             ioEx);
        }
    }
/////////////////////////////////////////////////////////////////////////////////////////
//  REOPEN-THREAD: Thread in charge of re-opening the IndexReader to have access to the
//                 latest IndexWriter changes
/////////////////////////////////////////////////////////////////////////////////////////
    private class LuceneNRTReopenThread
          extends NRTManagerReopenThread {

        volatile boolean _finished = false;

        public LuceneNRTReopenThread(final NRTManager manager,
                                     final double targetMaxStaleSec,final double targetMinStaleSec) {
            super(manager, targetMaxStaleSec, targetMinStaleSec);
            this.setName("NRT Reopen Thread");
            this.setPriority(Math.min(Thread.currentThread().getPriority()+2,
                                      Thread.MAX_PRIORITY));
            this.setDaemon(true);
        }
        public synchronized void startReopening() {
            _finished = false;
            this.start();
        }
        public synchronized void stopReopening() {
            _finished = true;
        }
        @Override
        public void run() {
            while (!_finished) {
                super.run();
            }
        }
    }
/////////////////////////////////////////////////////////////////////////////////////////
//  INDEX / RE-INDEX / UN-INDEX
/////////////////////////////////////////////////////////////////////////////////////////
    /**
     * Index a Lucene document
     * @param doc the document to be indexed
     */
    public void index(final Document doc) {
        // Index in lucene
        try {
            _reopenToken = _trackingIndexWriter.addDocument(doc);
            log.debug("document indexed in lucene");
        } catch (IOException ioEx) {
            log.error("Error while in Lucene index operation: {}",ioEx.getMessage(),
                                                                  ioEx);
        } finally {
            try {
                _indexWriter.commit();
            } catch (IOException ioEx) {
                log.error("Error while committing changes to Lucene index: {}",ioEx.getMessage(),
                                                                               ioEx);
            }
        }
    }
    /**
     * Updates the index info for a lucene document
     * @param recordIdTerm term that identifies the document to be replaced
     * @param doc the document to be indexed
     */
    public void reIndex(final Term recordIdTerm,
                        final Document doc) {
        // Index in lucene
        try {
            _reopenToken = _trackingIndexWriter.updateDocument(recordIdTerm,
                                                               doc);
            log.debug("{} document re-indexed in lucene",recordIdTerm.text());
        } catch (IOException ioEx) {
            log.error("Error in lucene re-indexing operation: {}",ioEx.getMessage(),
                                                                  ioEx);
        } finally {
            try {
                _indexWriter.commit();
            } catch (IOException ioEx) {
                log.error("Error while committing changes to Lucene index: {}",ioEx.getMessage(),
                                                                               ioEx);
            }
        }
    }
    /**
     * Unindex a lucene document
     * @param idTerm term used to locate the document to be unindexed
     *               IMPORTANT! the term must match the target document and nothing else,
     *                          otherwise all matching docs will be unindexed
     */
    public void unIndex(final Term idTerm) {
        try {
            _reopenToken = _trackingIndexWriter.deleteDocuments(idTerm);
            log.debug("{}={} term matching records un-indexed from lucene",idTerm.field(),
                                                                           idTerm.text());
        } catch (IOException ioEx) {
            log.error("Error in un-index lucene operation: {}",ioEx.getMessage(),
                                                               ioEx);
        } finally {
            try {
                _indexWriter.commit();
            } catch (IOException ioEx) {
                log.error("Error while committing changes to Lucene index: {}",ioEx.getMessage(),
                                                                               ioEx);
            }
        }
    }
    /**
     * Delete all lucene index docs
     */
    public void truncate() {
        try {
            _reopenToken = _trackingIndexWriter.deleteAll();
            log.warn("lucene index truncated");
        } catch (IOException ioEx) {
            log.error("Error truncating lucene index: {}",ioEx.getMessage(),
                                                          ioEx);
        } finally {
            try {
                _indexWriter.commit();
            } catch (IOException ioEx) {
                log.error("Error truncating lucene index: {}",ioEx.getMessage(),
                                                              ioEx);
            }
        }
    }
/////////////////////////////////////////////////////////////////////////////////////////
//  COUNT-SEARCH
/////////////////////////////////////////////////////////////////////////////////////////
    /**
     * Count the number of results returned by a search against the lucene index
     * @param qry the query
     * @return the total number of hits
     */
    public long count(final Query qry) {
        long outCount = 0;
        try {
            _searchManager.waitForGeneration(_reopenToken);  // wait until the index is re-opened
            IndexSearcher searcher = _searchManager.acquire();
            try {
                TopDocs docs = searcher.search(qry,0);
                if (docs != null) outCount = docs.totalHits;
                log.debug("count-search executed against lucene index returning {}",outCount);
            } finally {
                _searchManager.release(searcher);
            }
        } catch (IOException ioEx) {
            log.error("Error re-opening the index {}",ioEx.getMessage(),
                                                      ioEx);
        }
        return outCount;
    }
    /**
     * Executes a search query
     * @param qry the query to be executed
     * @param sortFields the sort criteria
     * @param firstResultItemOrder the order number of the first element to be returned
     * @param numberOfResults number of results to be returned
     * @return a page of search results
     */
    public LucenePageResults search(final Query qry,Set<SortField> sortFields,
                                    final int firstResultItemOrder,final int numberOfResults) {
        LucenePageResults outDocs = null;
        try {
            _searchManager.waitForGeneration(_reopenToken);  // wait until the index is re-opened for the last update
            IndexSearcher searcher = _searchManager.acquire();
            try {
                // sort criteria
                SortField[] theSortFields = null;
                if (CollectionUtils.hasData(sortFields)) theSortFields = CollectionUtils.toArray(sortFields,SortField.class);
                Sort theSort = CollectionUtils.hasData(theSortFields) ? new Sort(theSortFields)
                                                                      : null;
                // number of results to be returned
                int theNumberOfResults = firstResultItemOrder + numberOfResults;

                // Exec the search (if the sort criteria are null, they're not used)
                TopDocs scoredDocs = theSort != null ? searcher.search(qry,
                                                                       theNumberOfResults,
                                                                       theSort)
                                                     : searcher.search(qry,
                                                                       theNumberOfResults);
                log.debug("query {} {} executed against lucene index: returned {} total items, {} in this page",qry.toString(),
                          (theSort != null ? theSort.toString() : ""),
                          scoredDocs != null ? scoredDocs.totalHits : 0,
                          scoredDocs != null ? scoredDocs.scoreDocs.length : 0);
                outDocs = LucenePageResults.create(searcher,
                                                   scoredDocs,
                                                   firstResultItemOrder,numberOfResults);
            } finally {
                _searchManager.release(searcher);
            }
        } catch (IOException ioEx) {
            log.error("Error freeing the searcher {}",ioEx.getMessage(),
                                                      ioEx);
        }
        return outDocs;
    }
/////////////////////////////////////////////////////////////////////////////////////////
//  INDEX MAINTENANCE
/////////////////////////////////////////////////////////////////////////////////////////
    /**
     * Merges the lucene index segments into one
     * (this should NOT be used routinely, only rarely for index maintenance)
     */
    public void optimize() {
        try {
            _indexWriter.forceMerge(1);
            log.debug("Lucene index merged into one segment");
        } catch (IOException ioEx) {
            log.error("Error optimizing lucene index {}",ioEx.getMessage(),
                                                         ioEx);
        }
    }
}
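The `search` method asks Lucene for the top `firstResultItemOrder + numberOfResults` hits and hands them to `LucenePageResults.create`, which is not shown in the answer. As a rough sketch of what that paging step could look like, assuming `firstResultItemOrder` is a 0-based offset (the original does not say), the requested page can be cut out of the top-N array as below. The class `PageSlice` is hypothetical, and plain `int` doc ids stand in for Lucene's `ScoreDoc` entries so the example needs only the JDK.

```java
import java.util.Arrays;

// Hypothetical stand-in for the paging step; not the author's LucenePageResults.
class PageSlice {

    /**
     * Returns the slice [firstResultItemOrder, firstResultItemOrder + numberOfResults)
     * of the top hits, clamped to the number of hits actually returned.
     * Assumes firstResultItemOrder is 0-based.
     */
    static int[] page(int[] topDocIds, int firstResultItemOrder, int numberOfResults) {
        if (firstResultItemOrder < 0 || numberOfResults < 0) {
            throw new IllegalArgumentException("offset and page size must not be negative");
        }
        int from = Math.min(firstResultItemOrder, topDocIds.length);
        // long arithmetic so a huge page size cannot overflow int
        int to = (int) Math.min((long) from + numberOfResults, topDocIds.length);
        return Arrays.copyOfRange(topDocIds, from, to);
    }

    public static void main(String[] args) {
        int[] topFive = {7, 3, 9, 4, 1};                          // doc ids in ranked order
        System.out.println(Arrays.toString(page(topFive, 2, 2))); // prints [9, 4]
        System.out.println(Arrays.toString(page(topFive, 4, 3))); // prints [1]
        System.out.println(Arrays.toString(page(topFive, 8, 3))); // prints []
    }
}
```

Fetching offset plus page size and discarding the head is simple, but the cost grows with the offset, so it suits shallow paging better than deep paging.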