Can the Solr SuggestComponent return shingles instead of whole field values?

Ste*_*fan 4 solr autocomplete autosuggest

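I am using Solr 5.0.0 and want to build an autocomplete feature that generates suggestions from the word-grams (shingles) of my documents. The problem is that a suggest query only returns the complete "term" of the searched field, which can be extremely long.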

Current problem:

Input: "so"
Suggestions: "... extremely long text so n long text continues ..."

"... next long text so next text continues ..."

Goal:

Input: "so"

Suggestions made of shingles:

"so n"

"so lar"

"so test"

etc.

<searchComponent name="suggest" class="solr.SuggestComponent"
                 enable="${solr.suggester.enabled:true}">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">title_and_description_suggest</str>
    <str name="weightField">price</str>
    <str name="suggestAnalyzerFieldType">autocomplete</str>
    <str name="queryAnalyzerFieldType">autocomplete</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>

In schema.xml:

<fieldType name="autocomplete" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" format="snowball"/>
      <filter class="solr.ShingleFilterFactory" maxShingleSize="2" outputUnigrams="true" outputUnigramsIfNoShingles="true"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
</fieldType>

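I want to return at most 3 words as an autocomplete term. Is this possible with the SuggestComponent, or how else would you do it? No matter what I try, I always get back the complete field value of the matching documents.

Is this the expected behavior, or am I doing something wrong?

Many thanks in advance.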

小智 7

Define a fieldType in schema.xml as follows:

<fieldType name="text_autocomplete" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="5"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>
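To get a feel for which terms the index analyzer above would put into the index, here is a rough Python approximation of whitespace tokenizing, lowercasing, and shingling. It is only a sketch: the function names `shingles` and `with_prefix` are made up for illustration, the sample sentence is invented, and it assumes the filter also emits single words (unigrams) alongside the shingles. Solr's real analysis chain handles more cases than this.

```python
def shingles(text, min_size=2, max_size=5, output_unigrams=True):
    """Approximate the index analyzer: split on whitespace, lowercase,
    then emit every run of min_size..max_size adjacent tokens."""
    tokens = text.lower().split()
    terms = []
    for start in range(len(tokens)):
        if output_unigrams:
            # Assumption: single words are kept next to the shingles.
            terms.append(tokens[start])
        for size in range(min_size, max_size + 1):
            if start + size > len(tokens):
                break  # not enough tokens left for a longer shingle
            terms.append(" ".join(tokens[start:start + size]))
    return terms


def with_prefix(terms, prefix):
    """Keep the distinct terms that start with the typed prefix."""
    return sorted({t for t in terms if t.startswith(prefix)})


terms = shingles("Very long text so next text continues")
print(with_prefix(terms, "so"))
# ['so', 'so next', 'so next text', 'so next text continues']
```

Each of these terms is a separate indexed value, which is why a prefix match can return short phrases instead of the whole field.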

Also in schema.xml, define the field as follows:

<field name="example_field" type="text_autocomplete" indexed="true" stored="true"/>

Write your query as follows:

query?q=*&
rows=0&
facet=true&
facet.field=example_field&
facet.limit=-1&
wt=json&
indent=true&
facet.prefix=so

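In the facet.prefix parameter, specify the search term you want suggestions for ("so" in this example). If you want fewer than 5 words per suggestion, reduce maxShingleSize in the fieldType definition accordingly. Facet values are sorted by descending frequency by default only when facet.limit is greater than 0; with facet.limit=-1, as in the query above, add facet.sort=count to get that order.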
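On the client side, the facet query above could be built and its response read roughly like this. This is a minimal sketch, not a definitive client: the helper names `suggest_params` and `parse_facet_suggestions` are hypothetical, the sample response is invented for illustration, and it assumes the default flat JSON layout in which `facet_fields` alternates term and count. The prefix is lowercased because facet.prefix is compared against the indexed (already lowercased) terms without analysis.

```python
from urllib.parse import urlencode


def suggest_params(prefix, field="example_field"):
    """Build the query string for a facet.prefix suggestion request."""
    return urlencode({
        "q": "*",
        "rows": 0,
        "facet": "true",
        "facet.field": field,
        "facet.limit": -1,
        "facet.sort": "count",            # ask explicitly for frequency order
        "facet.prefix": prefix.lower(),   # indexed terms are lowercased
        "wt": "json",
    })


def parse_facet_suggestions(response, field="example_field"):
    """Turn the flat [term, count, term, count, ...] list into pairs."""
    flat = response["facet_counts"]["facet_fields"][field]
    return list(zip(flat[0::2], flat[1::2]))


# Invented response body, shaped like Solr's default JSON facet output.
sample = {
    "facet_counts": {
        "facet_fields": {"example_field": ["so next", 2, "so lar", 1]}
    }
}
print(parse_facet_suggestions(sample))
# [('so next', 2), ('so lar', 1)]
```

The query string from `suggest_params("so")` would be appended to the select handler URL of your core; the host, port, and core name depend on your installation.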