Stop words in Sitecore

rah*_*hul 8 lucene sitecore

We use Lucene for text search as part of Sitecore. Is there any way to ignore stop words (such as a, an, the, ...) in Sitecore search?

Yan*_*nko 14

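By default, Sitecore uses the Lucene standard analyzer, Lucene.Net.Analysis.Standard.StandardAnalyzer. You can see this defined in the /configuration/sitecore/search/analyzer element of the web.config file. One of the StandardAnalyzer constructors accepts an array of strings that it will treat as stop words. By default, it uses a hard-coded list of stop words, which includes: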

"a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that", "the", "their", "then", "there", "these", "they", "this", "to", "was", "will", "with"

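If you want to override this behavior, I think you should inherit from StandardAnalyzer and give your class a default constructor that takes the stop words from another source instead of the hard-coded array. You have a variety of options, and you could even read them from a text file. Don't forget to replace the standard class with your own class in web.config.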
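A minimal sketch of that idea, assuming a Lucene.Net version that still has the StandardAnalyzer(string[]) constructor mentioned above. The namespace, class name, and file location are hypothetical, and I have not verified this against a particular Sitecore release:

```csharp
using System;
using System.IO;
using System.Linq;
using Lucene.Net.Analysis.Standard;

namespace MySite.Search // hypothetical namespace
{
    // Behaves like StandardAnalyzer, but takes its stop words from a text
    // file instead of the hard-coded default list.
    public class FileStopWordAnalyzer : StandardAnalyzer
    {
        // Assumed location: one stop word per line, relative to the site root.
        private const string StopWordFile = @"App_Data\stopwords.txt";

        // Parameterless constructor that hands the custom list to the
        // base-class constructor accepting a string array.
        public FileStopWordAnalyzer() : base(LoadStopWords())
        {
        }

        private static string[] LoadStopWords()
        {
            string path = Path.Combine(
                AppDomain.CurrentDomain.BaseDirectory, StopWordFile);

            // Missing file: return an empty list, so no words are filtered.
            if (!File.Exists(path))
            {
                return new string[0];
            }

            // StandardAnalyzer lower-cases tokens before stop-word filtering,
            // so the list is normalized to lower case as well.
            return File.ReadAllLines(path)
                .Select(line => line.Trim().ToLowerInvariant())
                .Where(word => word.Length > 0)
                .ToArray();
        }
    }
}
```

With something like this in place, the type on the analyzer element in web.config would point at the new class (for example "MySite.Search.FileStopWordAnalyzer, MySite", where the assembly name is again an assumption). An empty file means no stop words are removed at all, which may be what you want if every word should be searchable.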

For more details on StandardAnalyzer, look at the other constructors of the class. .NET Reflector is your friend.