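Given a Lucene search query such as +(letter:A letter:B letter:C) +(style:Capital), how can I tell which of the three letters actually matched a given document? I don't care where they matched or how many times; I only need to know whether each one matched.
The goal is to take the initial query ("A B C"), remove the terms that matched successfully (A and B), and then do further processing on the remainder (C).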
Answer from L.B (score: 10)
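Although the sample is in C#, the Lucene API is very similar in both languages (apart from some casing differences), so I don't think translating it to Java would be hard.
Here is the usage: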
List<Term> terms = new List<Term>();    // will be filled with non-matched terms
List<Term> hitTerms = new List<Term>(); // will be filled with matched terms
GetHitTerms(query, searcher, docId, hitTerms, terms);
And here is the method:
// Recursively walks the query tree and sorts each term into
// hitTerms (matched document docId) or rest (did not match).
void GetHitTerms(Query query, IndexSearcher searcher, int docId, List<Term> hitTerms, List<Term> rest)
{
    if (query is TermQuery)
    {
        // Explain reports whether this single term matches the document.
        if (searcher.Explain(query, docId).IsMatch())
            hitTerms.Add((query as TermQuery).GetTerm());
        else
            rest.Add((query as TermQuery).GetTerm());
        return;
    }

    if (query is BooleanQuery)
    {
        // Recurse into every clause, regardless of its occurrence flag.
        BooleanClause[] clauses = (query as BooleanQuery).GetClauses();
        if (clauses == null) return;
        foreach (BooleanClause bc in clauses)
        {
            GetHitTerms(bc.GetQuery(), searcher, docId, hitTerms, rest);
        }
        return;
    }

    if (query is MultiTermQuery)
    {
        // Rewrite the query into a BooleanQuery of the index terms it expands to, then recurse.
        if (!(query is FuzzyQuery)) // FuzzyQuery doesn't support SetRewriteMethod
            (query as MultiTermQuery).SetRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE);
        GetHitTerms(query.Rewrite(searcher.GetIndexReader()), searcher, docId, hitTerms, rest);
    }
}
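For anyone who wants to try the recursion without a Lucene index, here is a minimal Java sketch of the same partitioning idea. It does not call Lucene at all: `Node`, `TermNode`, `BoolNode` and the `docTerms` set are hypothetical stand-ins I made up for `Query`, `TermQuery`, `BooleanQuery` and the `Explain(...).IsMatch()` check, so treat it as an illustration of the tree walk rather than a port of the method above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class HitTermsSketch {
    /** Hypothetical stand-in for Lucene's Query hierarchy. */
    interface Node {}

    /** Stand-in for TermQuery: a single field:text pair. */
    static final class TermNode implements Node {
        final String field;
        final String text;
        TermNode(String field, String text) { this.field = field; this.text = text; }
        @Override public String toString() { return field + ":" + text; }
    }

    /** Stand-in for BooleanQuery: a list of nested clauses. */
    static final class BoolNode implements Node {
        final List<Node> clauses;
        BoolNode(Node... clauses) { this.clauses = Arrays.asList(clauses); }
    }

    /**
     * Walks the tree and puts every term into hits or rest.
     * docTerms is an assumption of this sketch: it plays the role of the
     * per-term Explain(...).IsMatch() check against one document.
     */
    static void collect(Node node, Set<String> docTerms, List<String> hits, List<String> rest) {
        if (node instanceof TermNode) {
            String term = node.toString();
            (docTerms.contains(term) ? hits : rest).add(term);
        } else if (node instanceof BoolNode) {
            for (Node clause : ((BoolNode) node).clauses) {
                collect(clause, docTerms, hits, rest);
            }
        }
    }

    public static void main(String[] args) {
        // +(letter:A letter:B letter:C) +(style:Capital)
        Node query = new BoolNode(
            new BoolNode(new TermNode("letter", "A"), new TermNode("letter", "B"), new TermNode("letter", "C")),
            new BoolNode(new TermNode("style", "Capital")));
        // Pretend the document contains A and B in capital style, but not C.
        Set<String> docTerms = new HashSet<>(Arrays.asList("letter:A", "letter:B", "style:Capital"));

        List<String> hits = new ArrayList<>();
        List<String> rest = new ArrayList<>();
        collect(query, docTerms, hits, rest);

        System.out.println("matched:   " + hits); // [letter:A, letter:B, style:Capital]
        System.out.println("remaining: " + rest); // [letter:C]
    }
}
```

The `rest` list is what the question calls the remainder: whatever ends up there is what you would pass on for further processing.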