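When I search text and request highlighting, if the matched document field contains an exclamation mark, the returned highlight omits the part of the text up to and including the exclamation mark.
Elasticsearch version 7.1.1
Document: { "name" : "Yahoo! Inc [Please refer to Altaba Inc and Verizon Communications Inc]"}
Highlighting on a wildcard search for "inc"
Expected: the highlighted text should be: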
"Yahoo! <em>Inc</em> [Please refer to Altaba <em>Inc</em> and Verizon Communications <em>Inc</em>]"
Actual: "Yahoo!" is missing from the response. Got:
"<em>Inc</em> [Please refer to Altaba <em>Inc</em> and Verizon Communications <em>Inc</em>]"
I think this is related to the ! character. If I remove it, everything is OK.
Steps to reproduce:
Add the document to a new index
POST test/_doc/
{
  "name" : "Yahoo! Inc [Please refer to Altaba Inc and Verizon Communications Inc]"
}
No other settings/mappings
Run the query
GET test/_search
{
  "query": {
    "bool": {
      "should": [
        { "wildcard": { "name": { "wildcard": "inc*" } } }
      ]
    }
  },
  "highlight": {
    "fields": {
      "name" : {}
    }
  }
}
I get the following result:
"hits" : [
  {
    "_index" : "test",
    "_type" : "_doc",
    "_id" : "511tP3ABoqekxkoUshVf",
    "_score" : 1.0,
    "_source" : {
      "name" : "Yahoo! Inc [Please refer to Altaba Inc and Verizon Communications Inc]"
    },
    "highlight" : {
      "name" : [
        "<em>Inc</em> [Please refer to Altaba <em>Inc</em> and Verizon Communications <em>Inc</em>]"
      ]
    }
  }
]
Expected highlight:
"Yahoo! <em>Inc</em> [Please refer to Altaba <em>Inc</em> and Verizon Communications <em>Inc</em>]"
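This is expected behavior: by default, Elasticsearch highlighting returns only part of the searched text (a fragment). See: https://www.elastic.co/guide/en/elasticsearch/reference/7.1/search-request-highlighting.html#unified-highlighter
By default the unified highlighter splits the field into sentences, so ! and . are treated as the end of a sentence. "Yahoo!" therefore becomes a separate fragment that contains no match, and the highlighter does not return it.
In my case the searched field holds a name, so the text is short. By adding "number_of_fragments" : 0 I force the highlighter to return the whole document field.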
"highlight": {
"fields": {
"name" : {"number_of_fragments" : 0}
}
}
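For reference, a complete request body for GET test/_search that combines the query from the question with this setting might look like the sketch below. It assumes the same test index with the default mapping; with number_of_fragments set to 0 the highlighter should return the whole name field, including "Yahoo!", instead of sentence fragments.

```json
{
  "query": {
    "bool": {
      "should": [
        { "wildcard": { "name": { "wildcard": "inc*" } } }
      ]
    }
  },
  "highlight": {
    "fields": {
      "name": { "number_of_fragments": 0 }
    }
  }
}
```

Returning the whole field is reasonable here because names are short; for long text fields the default fragment behavior is usually what keeps responses small.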
Same as: https://github.com/elastic/elasticsearch/issues/52333