Solr suggester returns duplicate suggestions

kku*_*urt 5 solr autosuggest search-suggestion

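I am trying to use the suggester in Solr 5. Suggestions work, but I get duplicate suggestions. I tried to use grouping on the suggestions, but it does not work. How can I prevent duplicate suggestions?

Here is the relevant part of my schema.xml: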

<field name="Name" type="suggest" indexed="true" stored="true" multiValued="false"/>  
...
<fieldType name="suggest" class="solr.TextField">
  <analyzer type="index">        
        <tokenizer class="solr.StandardTokenizerFactory" />
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>             
        <filter class="solr.LowerCaseFilterFactory"/>           
        <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="15"/>              
  </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>      
        <filter class="solr.LowerCaseFilterFactory"/>           
      </analyzer>
</fieldType>

My solrconfig.xml:

<searchComponent name="suggest" class="solr.SuggestComponent">
<lst name="suggester">
  <str name="name">mySuggester</str>    
  <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
  <str name="suggestAnalyzerFieldType">suggest</str>      
  <str name="exactMatchFirst">true</str>
  <str name="dictionaryImpl">DocumentDictionaryFactory</str>      
  <str name="field">Name</str>
  <str name="weightField">Price</str>      
  <str name="buildOnCommit">true</str>        
  <str name="buildOnStartup">false</str>
  <str name="preserveSep">false</str>    
</lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">   
  <str name="suggest">true</str>
  <str name="suggest.count">5</str>
  <str name="suggest.dictionary">mySuggester</str>
  <str name="suggest.collate">true</str>     
</lst>
<arr name="components">
  <str>suggest</str>
  <str>query</str>    
</arr>
</requestHandler>

Sample output for the query "acer" with these params:

/suggest?&suggest.dictionary=mySuggester&suggest.q=acer

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">6</int>
</lst>
<lst name="suggest">
<lst name="mySuggester">
<lst name="acer">
<int name="numFound">5</int>
<arr name="suggestions">
<lst>
<str name="term">
<b>Acer</b> V3-772G-5421121TMAKK Intel Core i5 4210U 1.7GHz 12GB 1TB 17.3"
</str>
<long name="weight">2369</long>
<str name="payload"/>
</lst>
<lst>
<str name="term">
<b>Acer</b> V3-772G-5421121TMAKK Intel Core i5 4210U 1.7GHz 12GB 1TB 17.3"
</str>
<long name="weight">2369</long>
<str name="payload"/>
</lst>
<lst>
<str name="term">
<b>Acer</b> V3-772G-5421121TMAKK Intel Core i5 4210U 1.7GHz 12GB 1TB 17.3"
</str>
<long name="weight">2350</long>
<str name="payload"/>
</lst>
<lst>
<str name="term">
<b>Acer</b> V3-772G-542081TMamm Intel Core i5 4200M 2.5GHz / 3.1GHz 8GB 1TB 17.3"
</str>
<long name="weight">2099</long>
<str name="payload"/>
</lst>
<lst>
<str name="term">
<b>Acer</b> V3-772G-542081TMamm Intel Core i5 4200M 2.5GHz / 3.1GHz 8GB 1TB 17.3"
</str>
<long name="weight">2000</long>
<str name="payload"/>
</lst>
</arr>
</lst>
</lst>
</lst>
<result name="response" numFound="0" start="0"/>
</response>

As you can see, the suggestion Acer V3-772G-5421121TMAKK Intel Core i5 4210U 1.7GHz 12GB 1TB 17.3" appears three times.

Grouping does not work either:

/suggest?&suggest.dictionary=mySuggester&suggest.q=acer&group=true&group.field=Name

 <response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">90</int>
</lst>
<lst name="suggest">
<lst name="mySuggester">
<lst name="acer">
<int name="numFound">5</int>
<arr name="suggestions">
<lst>
<str name="term">
<b>Acer</b> V3-772G-5421121TMAKK Intel Core i5 4210U 1.7GHz 12GB 1TB 17.3"
</str>
<long name="weight">2369</long>
<str name="payload"/>
</lst>
<lst>
<str name="term">
<b>Acer</b> V3-772G-5421121TMAKK Intel Core i5 4210U 1.7GHz 12GB 1TB 17.3"
</str>
<long name="weight">2369</long>
<str name="payload"/>
</lst>
<lst>
<str name="term">
<b>Acer</b> V3-772G-5421121TMAKK Intel Core i5 4210U 1.7GHz 12GB 1TB 17.3"
</str>
<long name="weight">2350</long>
<str name="payload"/>
</lst>
<lst>
<str name="term">
<b>Acer</b> V3-772G-542081TMamm Intel Core i5 4200M 2.5GHz / 3.1GHz 8GB 1TB 17.3"
</str>
<long name="weight">2099</long>
<str name="payload"/>
</lst>
<lst>
<str name="term">
<b>Acer</b> V3-772G-542081TMamm Intel Core i5 4200M 2.5GHz / 3.1GHz 8GB 1TB 17.3"
</str>
<long name="weight">2000</long>
<str name="payload"/>
</lst>
</arr>
</lst>
</lst>
</lst>
<lst name="grouped">
<lst name="Name">
<int name="matches">0</int>
<arr name="groups"/>
</lst>
</lst>
</response>

Yas*_*ana 4

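You are using the DocumentDictionaryFactory dictionary implementation. It stores a suggestion term for each document, so if the same term exists in multiple documents, every one of those instances is returned.

To prevent this, you can:

  1. Write an intermediary API that reads suggestions from Solr (for example, 30 at a time) and de-duplicates them before returning the data.
  2. Use a different dictionary implementation, such as FileDictionaryFactory or HighFrequencyDictionaryFactory.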
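
Option 1 could look roughly like the sketch below. It assumes the suggest request was made with `wt=json` and a larger `suggest.count` (for example 30), and that the parsed response has the same nesting as the XML output in the question (`suggest` → dictionary name → query → `suggestions`). The function names `dedupe_suggestions` and `unique_suggestions` and the sample data are hypothetical, for illustration only; fetching the response over HTTP is left out.

```python
from typing import Any, Dict, List


def dedupe_suggestions(suggestions: List[Dict[str, Any]], limit: int = 5) -> List[Dict[str, Any]]:
    """Keep the first occurrence of each term, preserving the order Solr returned."""
    seen = set()
    unique: List[Dict[str, Any]] = []
    for suggestion in suggestions:
        if len(unique) >= limit:
            break
        # Collapse whitespace so terms that differ only in spacing count as equal.
        key = " ".join(suggestion["term"].split())
        if key in seen:
            continue
        seen.add(key)
        unique.append(suggestion)
    return unique


def unique_suggestions(solr_response: Dict[str, Any], dictionary: str,
                       query: str, limit: int = 5) -> List[Dict[str, Any]]:
    """Pull the suggestion list out of a parsed JSON response and de-duplicate it."""
    entry = solr_response["suggest"][dictionary][query]
    return dedupe_suggestions(entry["suggestions"], limit)


# Hypothetical sample mirroring the output in the question (assumed JSON shape).
TERM_A = '<b>Acer</b> V3-772G-5421121TMAKK Intel Core i5 4210U 1.7GHz 12GB 1TB 17.3"'
TERM_B = '<b>Acer</b> V3-772G-542081TMamm Intel Core i5 4200M 2.5GHz / 3.1GHz 8GB 1TB 17.3"'
SAMPLE_RESPONSE = {
    "suggest": {
        "mySuggester": {
            "acer": {
                "numFound": 5,
                "suggestions": [
                    {"term": TERM_A, "weight": 2369, "payload": ""},
                    {"term": TERM_A, "weight": 2369, "payload": ""},
                    {"term": TERM_A, "weight": 2350, "payload": ""},
                    {"term": TERM_B, "weight": 2099, "payload": ""},
                    {"term": TERM_B, "weight": 2000, "payload": ""},
                ],
            }
        }
    }
}

result = unique_suggestions(SAMPLE_RESPONSE, "mySuggester", "acer", limit=5)
```

Because the first occurrence of each term is kept, and the output in the question is ordered by descending weight, each surviving entry carries the highest weight for its term. Requesting more suggestions than you intend to show leaves enough candidates to fill the list after duplicates are dropped.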