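How do you read a Lucene index to run a search?

Question

How do you read a Lucene index to run a search?

I'm new to Lucene 4.3.

How do you do a simple search in Lucene 4.3?

I adapted the outline example into a simple Java test case: http://lucene.apache.org/core/4_3_0/core/overview-summary.html#overview_description

The example begins with: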

DirectoryReader ireader = DirectoryReader.open(directory);
IndexSearcher isearcher = new IndexSearcher(ireader);


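But according to the documentation, DirectoryReader is not visible (protected), so it looks as though you cannot use DirectoryReader.

So I did some digging and tried various permutations to avoid using DirectoryReader directly, including: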

File indexdir = new File("D:\\lucenetest\\") ; // location of my index
Directory directory = FSDirectory.open(indexdir);

IndexReader ireader = IndexReader.open(FSDirectory.open(indexdir)); //ERROR NoSuchMethodError
//IndexReader ireader = IndexReader.open(directory); //variation ERROR NoSuchMethodError
IndexSearcher isearcher = new IndexSearcher(ireader);


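And so on (including trying the atomic readers). Nothing seems to work. (I confirmed that Lucene Core is imported correctly.) Indexing works fine.

I looked at the Lucene demo search code for more clues: http://lucene.apache.org/core/4_2_1/demo/src-html/org/apache/lucene/demo/SearchFiles.html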

IndexReader reader = DirectoryReader.open(FSDirectory.open(new File(index))); //DirectoryReader not visible error
IndexSearcher searcher = new IndexSearcher(reader);


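This does not work either when used in the simple example file.

I have been able to get simple indexing working, and I previously got the Lucene demo working (both indexing and searching). However, I can't seem to get a simple search working.

Any clues?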

Answer 1

liz*_*zie 8

You can run a very simple search with this sample code:

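import java.io.File;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// directory where your index is stored
File path = new File(" ... /solr/solr/Collection1/data/index");

// open the index read-only through the public static factory method
Directory index = FSDirectory.open(path);
IndexReader reader = DirectoryReader.open(index);
IndexSearcher searcher = new IndexSearcher(reader);

// exact (un-analyzed) term to look up in field "myfield"
Term t = new Term("myfield", "myvalue");

// Get the top 10 docs
Query query = new TermQuery(t);
TopDocs tops = searcher.search(query, 10);
ScoreDoc[] scoreDoc = tops.scoreDocs;
System.out.println(scoreDoc.length);
for (ScoreDoc score : scoreDoc) {
    System.out.println("DOC " + score.doc + " SCORE " + score.score);
}

// Get the document frequency of the term (number of documents containing it)
int freq = reader.docFreq(t);
System.out.println("FREQ " + freq);

// release the index files when done
reader.close();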

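The code above obtains its reader through `DirectoryReader.open(...)`, a static method, rather than through a constructor. The "protected" member mentioned in the question is most likely the constructor, which does not prevent calling a public static factory on the same class. The sketch below shows that pattern in plain Java; `FactoryDemo` and `SketchReader` are hypothetical stand-ins and are not Lucene classes.

```java
public class FactoryDemo {

    // Hypothetical stand-in for a reader type: protected constructor, public factory.
    public static class SketchReader {
        private final String source;

        // Callers outside this package that do not subclass cannot invoke this directly.
        protected SketchReader(String source) {
            this.source = source;
        }

        // Public static factory: the supported way to obtain an instance.
        public static SketchReader open(String source) {
            return new SketchReader(source);
        }

        public String source() {
            return source;
        }
    }

    public static void main(String[] args) {
        // No "new SketchReader(...)" needed; the factory builds the object.
        SketchReader reader = SketchReader.open("index-dir");
        System.out.println(reader.source()); // prints "index-dir"
    }
}
```

If a call like this compiles but still fails at run time with NoSuchMethodError, that usually means the classes on the run-time classpath differ from the ones used at compile time, so the Lucene jar versions are worth checking.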

Answer 2

fut*_*ics 3

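I usually use this code. It is a class that encapsulates all the operations on a Lucene index (v4).
It uses near-real-time access to the index, so almost all updates are visible to the index readers right away:

Note: it also uses Lombok.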

// Imports assume Lucene 4.3 and Lombok. CollectionUtils, LuceneConstants and
// LucenePageResults are the author's own helper classes and are not shown here.
import java.io.IOException;
import java.util.Set;

import lombok.extern.slf4j.Slf4j;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.NRTManager;
import org.apache.lucene.search.NRTManager.TrackingIndexWriter;
import org.apache.lucene.search.NRTManagerReopenThread;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.SearcherFactory;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;

@Slf4j
public class LuceneIndex {
/////////////////////////////////////////////////////////////////////////////////////////
//  STATUS (see http://blog.mikemccandless.com/2011/11/near-real-time-readers-with-lucenes.html)
/////////////////////////////////////////////////////////////////////////////////////////
    private final IndexWriter _indexWriter;
    private final TrackingIndexWriter _trackingIndexWriter;
    private final NRTManager _searchManager;

    LuceneNRTReopenThread _reopenThread = null;
    private long _reopenToken;  // token returned by the index update/delete methods

/////////////////////////////////////////////////////////////////////////////////////////
//  CONSTRUCTOR
/////////////////////////////////////////////////////////////////////////////////////////
    /**
     * Constructor based on an instance of the type responsible for persisting the Lucene index
     */
    public LuceneIndex(final Directory luceneDirectory,
                       final Analyzer analyzer) {
        try {
            // Create the indexWriter
            _indexWriter = new IndexWriter(luceneDirectory,
                                           new IndexWriterConfig(LuceneConstants.VERSION,
                                                                 analyzer));
            _trackingIndexWriter = new NRTManager.TrackingIndexWriter(_indexWriter);
            // Create the SearchManager to exec the search
            _searchManager = new NRTManager(_trackingIndexWriter,
                                            new SearcherFactory(),
                                            true);

            // Start the thread in charge of re-opening the index so that it sees real-time changes
            //      The index is refreshed every 60 s when nobody is waiting
            //      and every 100 ms whenever someone is waiting (see search method)
            // (see http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/search/NRTManagerReopenThread.html)
            _reopenThread = new LuceneNRTReopenThread(_searchManager,
                                                      60.0,     // when there is nobody waiting
                                                      0.1);     // when there is someone waiting
            _reopenThread.startReopening();

        } catch (IOException ioEx) {
//          if (luceneDirectory instanceof JdbcDirectory) {
//              throw new IllegalStateException("The DB table for the lucene index could not be created: " + ioEx.getMessage(),ioEx);
//          } else {
                throw new IllegalStateException("Lucene index could not be created: " + ioEx.getMessage());
//          }
        }
    }
/////////////////////////////////////////////////////////////////////////////////////////
//  FINALIZER
/////////////////////////////////////////////////////////////////////////////////////////
    @Override
    protected void finalize() throws Throwable {
        this.close();
        super.finalize();
    }
    /**
     * Closes every index
     */
    public void close() {
        try {
            // stop the index reader re-open thread
            _reopenThread.stopReopening();
            _reopenThread.interrupt();

            // Close the search manager
            _searchManager.close();

            // Close the indexWriter, committing everything that's pending
            _indexWriter.commit();
            _indexWriter.close();

        } catch (IOException ioEx) {
            log.error("Error while closing lucene index: {}",ioEx.getMessage(),
                      ioEx);
        }
    }
/////////////////////////////////////////////////////////////////////////////////////////
//  REOPEN-THREAD: Thread in charge of re-opening the IndexReader to have access to the
//                 latest IndexWriter changes
/////////////////////////////////////////////////////////////////////////////////////////
    private class LuceneNRTReopenThread
          extends NRTManagerReopenThread {

        volatile boolean _finished = false;

        public LuceneNRTReopenThread(final NRTManager manager,
                                     final double targetMaxStaleSec,final double targetMinStaleSec) {
            super(manager, targetMaxStaleSec, targetMinStaleSec);
            this.setName("NRT Reopen Thread");
            this.setPriority(Math.min(Thread.currentThread().getPriority()+2,
                                      Thread.MAX_PRIORITY));
            this.setDaemon(true);
        }
        public synchronized void startReopening() {
            _finished = false;
            this.start();
        }
        public synchronized void stopReopening() {
            _finished = true;
        }
        @Override
        public void run() {
            while (!_finished) {
                super.run();
            }
        }
    }
/////////////////////////////////////////////////////////////////////////////////////////
//  INDEX / RE-INDEX / UN-INDEX
/////////////////////////////////////////////////////////////////////////////////////////
    /**
     * Index a Lucene document
     * @param doc the document to be indexed
     */
    public void index(final Document doc) {
        // Index in Lucene
        try {
            _reopenToken = _trackingIndexWriter.addDocument(doc);
            log.debug("document indexed in lucene");
        } catch (IOException ioEx) {
            log.error("Error while in Lucene index operation: {}",ioEx.getMessage(),
                      ioEx);
        } finally {
            try {
                _indexWriter.commit();
            } catch (IOException ioEx) {
                log.error("Error while committing changes to Lucene index: {}",ioEx.getMessage(),
                          ioEx);
            }
        }
    }
    /**
     * Updates the index info for a lucene document
     * @param recordIdTerm term that identifies the document to be replaced
     * @param doc the document to be indexed
     */
    public void reIndex(final Term recordIdTerm,
                        final Document doc) {
        // Index in Lucene
        try {
            _reopenToken = _trackingIndexWriter.updateDocument(recordIdTerm,
                                                               doc);
            log.debug("{} document re-indexed in lucene",recordIdTerm.text());
        } catch (IOException ioEx) {
            log.error("Error in lucene re-indexing operation: {}",ioEx.getMessage(),
                      ioEx);
        } finally {
            try {
                _indexWriter.commit();
            } catch (IOException ioEx) {
                log.error("Error while committing changes to Lucene index: {}",ioEx.getMessage(),
                          ioEx);
            }
        }
    }
    /**
     * Unindex a lucene document
     * @param idTerm term used to locate the document to be unindexed
     *               IMPORTANT! the term must match that document and only that document,
     *                          otherwise all matching docs will be unindexed
     */
    public void unIndex(final Term idTerm) {
        try {
            _reopenToken = _trackingIndexWriter.deleteDocuments(idTerm);
            log.debug("{}={} term matching records un-indexed from lucene",idTerm.field(),
                      idTerm.text());
        } catch (IOException ioEx) {
            log.error("Error in un-index lucene operation: {}",ioEx.getMessage(),
                      ioEx);
        } finally {
            try {
                _indexWriter.commit();
            } catch (IOException ioEx) {
                log.error("Error while committing changes to Lucene index: {}",ioEx.getMessage(),
                          ioEx);
            }
        }
    }
    /**
     * Delete all lucene index docs
     */
    public void truncate() {
        try {
            _reopenToken = _trackingIndexWriter.deleteAll();
            log.warn("lucene index truncated");
        } catch (IOException ioEx) {
            log.error("Error truncating lucene index: {}",ioEx.getMessage(),
                      ioEx);
        } finally {
            try {
                _indexWriter.commit();
            } catch (IOException ioEx) {
                log.error("Error truncating lucene index: {}",ioEx.getMessage(),
                          ioEx);
            }
        }
    }
/////////////////////////////////////////////////////////////////////////////////////////
//  COUNT-SEARCH
/////////////////////////////////////////////////////////////////////////////////////////
    /**
     * Count the number of results returned by a search against the lucene index
     * @param qry the query
     * @return the total number of matching documents
     */
    public long count(final Query qry) {
        long outCount = 0;
        try {
            _searchManager.waitForGeneration(_reopenToken);     // wait until the index is re-opened
            IndexSearcher searcher = _searchManager.acquire();
            try {
                // Ask for one hit: Lucene 4.x rejects a hit count of 0,
                // and totalHits still reports the full number of matches
                TopDocs docs = searcher.search(qry,1);
                if (docs != null) outCount = docs.totalHits;
                log.debug("count-search executed against lucene index returning {}",outCount);
            } finally {
                _searchManager.release(searcher);
            }
        } catch (IOException ioEx) {
            log.error("Error re-opening the index {}",ioEx.getMessage(),
                      ioEx);
        }
        return outCount;
    }
    /**
     * Executes a search query
     * @param qry the query to be executed
     * @param sortFields the sort criteria for the results
     * @param firstResultItemOrder the order number of the first element to be returned
     * @param numberOfResults number of results to be returned
     * @return a page of search results
     */
    public LucenePageResults search(final Query qry,Set<SortField> sortFields,
                                    final int firstResultItemOrder,final int numberOfResults) {
        LucenePageResults outDocs = null;
        try {
            _searchManager.waitForGeneration(_reopenToken); // wait until the index is re-opened for the last update
            IndexSearcher searcher = _searchManager.acquire();
            try {
                // sort criteria
                SortField[] theSortFields = null;
                if (CollectionUtils.hasData(sortFields)) theSortFields = CollectionUtils.toArray(sortFields,SortField.class);
                Sort theSort = CollectionUtils.hasData(theSortFields) ? new Sort(theSortFields)
                                                                      : null;
                // number of results to be returned
                int theNumberOfResults = firstResultItemOrder + numberOfResults;

                // Exec the search (if the sort criteria are null, they're not used)
                TopDocs scoredDocs = theSort != null ? searcher.search(qry,
                                                                       theNumberOfResults,
                                                                       theSort)
                                                     : searcher.search(qry,
                                                                       theNumberOfResults);
                log.debug("query {} {} executed against lucene index: returned {} total items, {} in this page",
                          qry.toString(),
                          (theSort != null ? theSort.toString() : ""),
                          scoredDocs != null ? scoredDocs.totalHits : 0,
                          scoredDocs != null ? scoredDocs.scoreDocs.length : 0);
                outDocs = LucenePageResults.create(searcher,
                                                   scoredDocs,
                                                   firstResultItemOrder,numberOfResults);
            } finally {
                _searchManager.release(searcher);
            }
        } catch (IOException ioEx) {
            log.error("Error freeing the searcher {}",ioEx.getMessage(),
                      ioEx);
        }
        return outDocs;
    }
/////////////////////////////////////////////////////////////////////////////////////////
//  INDEX MAINTENANCE
/////////////////////////////////////////////////////////////////////////////////////////
    /**
     * Merges the lucene index segments into one
     * (this should NOT normally be used, only rarely for index maintenance)
     */
    public void optimize() {
        try {
            _indexWriter.forceMerge(1);
            log.debug("Lucene index merged into one segment");
        } catch (IOException ioEx) {
            log.error("Error optimizing lucene index {}",ioEx.getMessage(),
                      ioEx);
        }
    }
}

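The `search` method above asks Lucene for `firstResultItemOrder + numberOfResults` hits and passes them to `LucenePageResults.create`, which the answer does not show. One plausible way such a helper could cut the requested page out of the returned hits is sketched below with plain Java lists. `PageSlice` and `page` are hypothetical names used only for illustration, and the author's real helper may work differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PageSlice {

    /**
     * Returns the elements [first, first + count) of topHits,
     * clamped to the bounds of the list (hypothetical helper, not part of Lucene).
     */
    public static <T> List<T> page(List<T> topHits, int first, int count) {
        int from = Math.min(Math.max(first, 0), topHits.size());
        // use long arithmetic so a very large count cannot overflow
        int to = (int) Math.min((long) from + Math.max(count, 0), topHits.size());
        return new ArrayList<>(topHits.subList(from, to));
    }

    public static void main(String[] args) {
        // Pretend these are the first five hits returned by a search
        List<Integer> hits = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(page(hits, 2, 2)); // prints [3, 4]
    }
}
```

Fetching `first + count` hits and discarding the leading `first` is simple, but the cost grows with the page offset, so it fits small offsets best.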

Archived: 13 years ago
Views: 17,705
Last activity: 10 years, 9 months ago