How to implement a search with multiple filters using Lucene.Net

MSR*_*SRS 7 lucene.net

I am new to Lucene.Net. I want to implement search over a database of clients. I have the following scenario:

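  • The user searches for clients within the currently selected city.
  • To search for clients in another city, the user must change the city and run the search again.
  • To refine the search results, we need to offer filters on Area (multiple values), Pincode, and so on. In other words, I need the Lucene equivalent of the following SQL queries: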

    SELECT * FROM CLIENTS
         WHERE CITY = N'City1'
         AND (Area like N'%area1%' OR Area like N'%area2%')
    
    SELECT * FROM CLIENTS
        WHERE CITY IN ('MUMBAI', 'DELHI')
        AND CLIENTTYPE IN ('GOLD', 'SILVER')
    

Below is the code I implemented to apply the city as a filter on the search:

private static IEnumerable<ClientSearchIndexItemDto> _search(string searchQuery, string city, string searchField = "")
{
    // validation
    if (string.IsNullOrEmpty(searchQuery.Replace("*", "").Replace("?", "")))
        return new List<ClientSearchIndexItemDto>();

    // set up Lucene searcher
    using (var searcher = new IndexSearcher(_directory, false))
    {
        var hits_limit = 1000;
        var analyzer = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30);

        // search by single field
        if (!string.IsNullOrEmpty(searchField))
        {
            var parser = new QueryParser(Lucene.Net.Util.Version.LUCENE_30, searchField, analyzer);
            var query = parseQuery(searchQuery, parser);
            var hits = searcher.Search(query, hits_limit).ScoreDocs;
            var results = _mapLuceneToDataList(hits, searcher);
            analyzer.Close();
            return results; // the enclosing using block disposes the searcher
        }
        else // search by multiple fields (ordered by RELEVANCE)
        {
            var parser = new MultiFieldQueryParser(Lucene.Net.Util.Version.LUCENE_30, new[]
            {
                "ClientId",
                "ClientName",
                "ClientTypeNames",
                "CountryName",
                "StateName",
                "DistrictName",
                "City",
                "Area",
                "Street",
                "Pincode",
                "ContactNumber",
                "DateModified"
            }, analyzer);
            var query = parseQuery(searchQuery, parser);
            var f = new FieldCacheTermsFilter("City",new[] { city });
            var hits = searcher.Search(query, f, hits_limit, Sort.RELEVANCE).ScoreDocs;
            var results = _mapLuceneToDataList(hits, searcher);
            analyzer.Close();
            return results; // the enclosing using block disposes the searcher
        }
    }
}

Now I have to provide more filters on Area, Pincode, etc., where Area can have multiple values. I tried a BooleanQuery like the one below:

var cityFilter = new TermQuery(new Term("City", city));
var areasFilter = new FieldCacheTermsFilter("Area", areas); // areas is a string[]

BooleanQuery filterQuery = new BooleanQuery();
filterQuery.Add(cityFilter, Occur.MUST);
filterQuery.Add(areasFilter, Occur.MUST); // problem: filterQuery.Add has no overload that accepts string[]

If we do the same with a single area, it works fine.
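For reference, the SQL shape above (one city AND any of several areas) can also be written purely with queries by nesting one BooleanQuery inside another. This is a minimal sketch, not the asker's code: it assumes Lucene.Net 3.0.x, assumes the "City" and "Area" fields are indexed so that each value is a single exact term, and the class and method names (`ClientQueries`, `BuildCityAndAreasQuery`) are hypothetical.

```csharp
using Lucene.Net.Index;
using Lucene.Net.Search;

public static class ClientQueries
{
    // Builds: City == city AND (Area == areas[0] OR Area == areas[1] OR ...)
    // Assumption: field values are indexed as exact, untokenized terms.
    public static BooleanQuery BuildCityAndAreasQuery(string city, string[] areas)
    {
        var combined = new BooleanQuery();
        combined.Add(new TermQuery(new Term("City", city)), Occur.MUST);

        if (areas != null && areas.Length > 0)
        {
            // Each area is optional on its own (SHOULD) ...
            var anyArea = new BooleanQuery();
            foreach (var area in areas)
            {
                anyArea.Add(new TermQuery(new Term("Area", area)), Occur.SHOULD);
            }

            // ... but the group as a whole is required (MUST),
            // so at least one of the areas has to match.
            combined.Add(anyArea, Occur.MUST);
        }

        return combined;
    }
}
```

If a Filter is needed rather than a Query, the result can be wrapped with `new QueryWrapperFilter(query)` and passed to `searcher.Search(query, filter, n)`. Note that TermQuery does not analyze its input, so with a field indexed through StandardAnalyzer the terms would have to be supplied in their analyzed (for example lower-cased) form.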

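I have also tried a ChainedFilter like the one below, but it does not seem to meet the requirement. The following code performs an OR between city and area, whereas the requirement is to OR the given areas with each other within the selected city.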

var f = new ChainedFilter(new Filter[] { cityFilter, areasFilter });

Can anyone suggest how to achieve this in Lucene.Net? Any help would be appreciated.

sis*_*sve 14

You are looking for BooleanFilter. Almost every query object has a matching filter object.

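Consider TermsFilter (from Lucene.Net.Contrib.Queries) if your index does not meet the requirements of FieldCacheTermsFilter. From the latter's documentation: "This filter requires that the field contains only a single term for all documents."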

var cityFilter = new FieldCacheTermsFilter("CITY", new[] {"MUMBAI", "DELHI"});
var clientTypeFilter = new FieldCacheTermsFilter("CLIENTTYPE", new [] { "GOLD", "SILVER" });

var areaFilter = new TermsFilter();
areaFilter.AddTerm(new Term("Area", "area1"));
areaFilter.AddTerm(new Term("Area", "area2"));

var filter = new BooleanFilter();
filter.Add(new FilterClause(cityFilter, Occur.MUST));
filter.Add(new FilterClause(clientTypeFilter, Occur.MUST));
filter.Add(new FilterClause(areaFilter, Occur.MUST));

IndexSearcher searcher = null; // TODO.
Query query = null; // TODO.
Int32 hits_limit = 0; // TODO.
var hits = searcher.Search(query, filter, hits_limit, Sort.RELEVANCE).ScoreDocs;
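The filters above match exact indexed terms, so how the fields are indexed matters. The following end-to-end sketch shows one way to try the BooleanFilter against a small in-memory index. It assumes Lucene.Net 3.0.x plus the Lucene.Net.Contrib.Queries assembly; the sample data, the `BooleanFilterDemo` class, and the `AddClient` helper are hypothetical, and the fields are indexed as NOT_ANALYZED so that each one holds a single term per document, as FieldCacheTermsFilter requires.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;

public static class BooleanFilterDemo
{
    // Hypothetical helper: stores each value as one exact, untokenized term.
    private static void AddClient(IndexWriter writer, string name, string city, string area)
    {
        var doc = new Document();
        doc.Add(new Field("ClientName", name, Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.Add(new Field("City", city, Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.Add(new Field("Area", area, Field.Store.YES, Field.Index.NOT_ANALYZED));
        writer.AddDocument(doc);
    }

    // Counts clients with City == city AND Area in areas.
    public static int CountMatches(string city, string[] areas)
    {
        var directory = new RAMDirectory();
        using (var writer = new IndexWriter(directory, new KeywordAnalyzer(),
                                            IndexWriter.MaxFieldLength.UNLIMITED))
        {
            // Made-up sample data for illustration only.
            AddClient(writer, "A", "MUMBAI", "area1");
            AddClient(writer, "B", "MUMBAI", "area2");
            AddClient(writer, "C", "MUMBAI", "area3");
            AddClient(writer, "D", "DELHI", "area1");
        }

        var cityFilter = new FieldCacheTermsFilter("City", city);

        // TermsFilter matches documents containing any of the added terms (OR).
        var areaFilter = new TermsFilter();
        foreach (var area in areas)
        {
            areaFilter.AddTerm(new Term("Area", area));
        }

        // Both clauses are required, so the result is city AND (area OR area ...).
        var filter = new BooleanFilter();
        filter.Add(new FilterClause(cityFilter, Occur.MUST));
        filter.Add(new FilterClause(areaFilter, Occur.MUST));

        using (var searcher = new IndexSearcher(directory, true))
        {
            // MatchAllDocsQuery stands in for the parsed user query.
            return searcher.Search(new MatchAllDocsQuery(), filter, 10).TotalHits;
        }
    }
}
```

Calling `BooleanFilterDemo.CountMatches("MUMBAI", new[] { "area1", "area2" })` should count only the Mumbai clients in the two listed areas. In a real application the MatchAllDocsQuery would be replaced by the query produced by the parser.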