Getting the matched terms from a Lucene query

Bob*_*son 2 lucene

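Given a Lucene search query such as +(letter:A letter:B letter:C) +(style:Capital), how can I tell which of the three letters actually matched a given document? I don't care where they matched or how many times; I only need to know whether each one matched.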

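The goal is to take the initial query ("A B C"), remove the terms that matched successfully (A and B), and then do further processing on the remainder (C).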

L.B*_*L.B 10

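The sample is in C#, but the Lucene API is very similar across the two platforms (apart from some casing differences), so I don't think translating it to Java would be hard.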

Here is the usage:

List<Term> terms = new List<Term>();    //will be filled with non-matched terms
List<Term> hitTerms = new List<Term>(); //will be filled with matched terms
GetHitTerms(query, searcher, docId, hitTerms, terms);

Here is the method:

// Recursively walks the query tree and sorts every term into
// hitTerms (matched document docId) or rest (did not match it).
void GetHitTerms(Query query, IndexSearcher searcher, int docId, List<Term> hitTerms, List<Term> rest)
{
    if (query is TermQuery)
    {
        // Explain() reports whether this single term matches the document.
        Term term = ((TermQuery)query).GetTerm();
        if (searcher.Explain(query, docId).IsMatch())
            hitTerms.Add(term);
        else
            rest.Add(term);
        return;
    }

    if (query is BooleanQuery)
    {
        // Check each clause of the boolean query in turn.
        BooleanClause[] clauses = ((BooleanQuery)query).GetClauses();
        if (clauses == null) return;

        foreach (BooleanClause bc in clauses)
        {
            GetHitTerms(bc.GetQuery(), searcher, docId, hitTerms, rest);
        }
        return;
    }

    if (query is MultiTermQuery)
    {
        // Rewrite into a BooleanQuery so the expanded terms can be checked one by one.
        if (!(query is FuzzyQuery)) // FuzzyQuery doesn't support SetRewriteMethod
            ((MultiTermQuery)query).SetRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE);

        GetHitTerms(query.Rewrite(searcher.GetIndexReader()), searcher, docId, hitTerms, rest);
    }
}
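For the last step described in the question (dropping the matched letters from the original input and keeping the rest), a minimal sketch might look like the following. It deliberately works on plain strings rather than Lucene types, so QueryRemainder and Remaining are hypothetical names and not part of the Lucene API; how you pull the text out of each matched Term, and whether your analyzer has changed its case, depends on your setup.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Lucene: works on plain term texts.
public static class QueryRemainder
{
    // Returns the input tokens that are NOT among the matched term texts,
    // keeping the order in which they appeared in the original input.
    public static List<string> Remaining(IEnumerable<string> inputTokens,
                                         IEnumerable<string> matchedTexts)
    {
        // Assumption: exact, case-sensitive comparison. Swap the comparer
        // if your analyzer lowercases terms at index time.
        var matched = new HashSet<string>(matchedTexts, StringComparer.Ordinal);
        return inputTokens.Where(t => !matched.Contains(t)).ToList();
    }
}
```

With the input tokens A, B, C and the matched texts A and B, Remaining returns a list containing only C, which can then be passed on for further processing.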