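I'm using Lucene.net to index a product catalog. I profiled my search with ANTS Profiler and noticed that creating the MultiFieldQueryParser and parsing the query takes almost as long as the actual search (about 100 ms). I then tried building the query by hand, which is very fast (about 1 ms). I would rather not parse manually: although it gives me the same result set, I'm worried that I may not handle certain use cases or inputs (though the input comes from a text search box on a website, and users won't know anything about Lucene's search syntax). My code (using both approaches) is below: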
IApplicationSettings settings = new ApplicationSettingService();
FSDirectory directory = FSDirectory.Open(new DirectoryInfo(settings.GetSetting<string>("LuceneMainSearchDirectory")));
RAMDirectory ramDir = new RAMDirectory(directory);
_Searcher = new IndexSearcher(ramDir, true);
string[] searchFields = new string[] { "ProductName", "ProductLongDescription", "BrandName", "CategoryName" };
//Add a wildcard character to end of search to give broader results
if (!searchTerm.EndsWith(" ")) { searchTerm = searchTerm + "*"; }
//Use query parser...this block typically takes about 100ms on my machine, roughly 40% on the constructor and 60% on the call to Parse
MultiFieldQueryParser multiParser = new MultiFieldQueryParser(Lucene.Net.Util.Version.LUCENE_29, searchFields, _analyzer);
multiParser.SetDefaultOperator(QueryParser.AND_OPERATOR);
Query query = multiParser.Parse(searchTerm);
//Manually create query....this block doesn't even take 1ms on my machine
BooleanQuery booleanQuery = new BooleanQuery(true);
//Skip empty entries so stray spaces don't produce empty terms that can never match
var terms = searchTerm.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
foreach (string s in terms)
{
    //Each word must match in at least one of the search fields
    BooleanQuery subQuery = new BooleanQuery(true);
    foreach (string field in searchFields)
    {
        //Words ending in * become wildcard queries; all others are exact term queries
        Term term = new Term(field, s);
        Query fieldQuery = s.EndsWith("*") ? (Query)new WildcardQuery(term) : new TermQuery(term);
        subQuery.Add(fieldQuery, BooleanClause.Occur.SHOULD);
    }
    booleanQuery.Add(subQuery, BooleanClause.Occur.MUST);
}
//Run the search....results are the same for simple multiword text queries
var result2 = _Searcher.Search(booleanQuery, null, maxResults);
var result = _Searcher.Search(query, null, maxResults);
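One option that would spare me the manual query building might be to share a single MultiFieldQueryParser, but I gather that its Parse method is not thread-safe (though I've only read that about the Java version... please correct me if my assumption is wrong).
Am I doing something wrong, or is this just the nature of the beast?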
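Under the hood, MultiFieldQueryParser simply uses multiple regular QueryParsers, one for each field you want to query against.
In general, creating a QueryParser costs more than simply creating a Query by hand.
The parser handles the complex query syntax documented here: Apache Lucene - Query Parser Syntax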
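Since the question says the input comes from users who don't know the query syntax, one way to keep the parser without letting stray special characters change the meaning of a search is to escape the input first. The following is only a sketch: `SearchInput` and `ToParserInput` are hypothetical names that do not appear in the original post, while `QueryParser.Escape` is the static helper from `Lucene.Net.QueryParsers`. It keeps the question's rule of appending `*` unless the input ends with a space.

```csharp
using System;
using Lucene.Net.QueryParsers;

public static class SearchInput
{
    // Hypothetical helper (not from the original post): prepares raw text
    // from a site search box before it is handed to MultiFieldQueryParser.Parse.
    public static string ToParserInput(string rawInput)
    {
        if (string.IsNullOrEmpty(rawInput) || rawInput.Trim().Length == 0)
        {
            return string.Empty; // nothing to search for
        }

        // Backslash-escape the characters the parser treats as syntax,
        // such as ( ) : + - * ? and quotes.
        string escaped = QueryParser.Escape(rawInput);

        // Same rule as in the question: add a trailing wildcard unless the
        // input ends with a space. It is appended after escaping so that
        // this one * is still treated as a wildcard.
        return rawInput.EndsWith(" ") ? escaped : escaped + "*";
    }
}
```

With this, `ToParserInput("ipod (black)")` yields `ipod \(black\)*`. Note that escaping only covers special characters; the words AND, OR and NOT are still read as operators by the parser.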
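The parser also runs the search query through the Analyzer you specify. If you use an Analyzer at index time, you must use the same Analyzer (or equivalent logic) in your search code. If you don't, you will end up missing results.
If you indexed with a whitespace analyzer, then your code that builds the BooleanQuery by hand is fine.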
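If the index was built with some other analyzer, a hand-built query can still match as long as each word goes through that same analyzer before it becomes a Term. The sketch below is one possible way to do that, not the approach from the question: `ManualQueryBuilder`, `Analyze` and `Build` are hypothetical names, and it assumes the Lucene.Net 2.9.x attribute API (`TermAttribute`, `AddAttribute(typeof(...))`); later versions expose this differently. Wildcard handling is left out for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Tokenattributes;
using Lucene.Net.Index;
using Lucene.Net.Search;

public static class ManualQueryBuilder
{
    // Runs text through the analyzer and returns the tokens it emits.
    // Assumes the Lucene.Net 2.9.x token attribute API.
    public static List<string> Analyze(Analyzer analyzer, string field, string text)
    {
        var tokens = new List<string>();
        TokenStream stream = analyzer.TokenStream(field, new StringReader(text));
        var termAttr = (TermAttribute)stream.AddAttribute(typeof(TermAttribute));
        while (stream.IncrementToken())
        {
            tokens.Add(termAttr.Term());
        }
        stream.Close();
        return tokens;
    }

    // Every word of the user's text must match in at least one field.
    public static Query Build(Analyzer analyzer, string[] fields, string userText)
    {
        var allWords = new BooleanQuery();
        string[] words = userText.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string word in words)
        {
            var anyField = new BooleanQuery();
            bool hasClause = false;
            foreach (string field in fields)
            {
                foreach (string token in Analyze(analyzer, field, word))
                {
                    anyField.Add(new TermQuery(new Term(field, token)), BooleanClause.Occur.SHOULD);
                    hasClause = true;
                }
            }
            // The analyzer may drop a word entirely (a stop word, for example);
            // an empty required clause would match nothing, so skip it.
            if (hasClause)
            {
                allWords.Add(anyField, BooleanClause.Occur.MUST);
            }
        }
        return allWords;
    }
}
```

Called as `ManualQueryBuilder.Build(_analyzer, searchFields, searchTerm)`, this keeps query construction cheap while applying the same normalization (lower-casing, for instance) that the index-time analyzer applied.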