mla*_*erg 12 highlighting elasticsearch
Using Elasticsearch's highlighting feature:
"highlight": {
"fields": {
"tags": { "number_of_fragments": 0 }
}
}
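With number_of_fragments: 0, no fragments are produced; the entire content of the field is returned instead. This is useful for short texts, because the document can be displayed as usual and readers can easily scan for the highlighted parts.
How can this be used when the document contains an array with multiple values?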
PUT /test/doc/1
{
"tags": [
"one hit tag",
"two foo tag",
"three hit tag",
"four foo tag"
]
}
GET /test/doc/_search
{
"query": {
"match": { "tags": "hit"}
},
"highlight": {
"fields": {
"tags": { "number_of_fragments": 0 }
}
}
}
Now I would like to show the user:
1 result:
Document 1, tagged with:
"one hit tag","two foo tag","three hit tag","four foo tag"
Unfortunately, this is the result of the query:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.10848885,
"hits": [
{
"_index": "test",
"_type": "doc",
"_id": "1",
"_score": 0.10848885,
"_source": {
"tags": [
"one hit tag",
"two foo tag",
"three hit tag",
"four foo tag"
]
},
"highlight": {
"tags": [
"one <em>hit</em> tag",
"three <em>hit</em> tag"
]
}
}
]
}
}
The highlight section contains only the array values that matched. How can I get this instead:
"tags": [
"one <em>hit</em> tag",
"two foo tag",
"three <em>hit</em> tag",
"four foo tag"
]
One possibility is to strip the <em> HTML tags from the highlighted values and then look the resulting plain strings up in the original field:
tags = [
"one hit tag",
"two foo tag",
"three hit tag",
"four foo tag"
]
highlighted = [
"one <em>hit</em> tag",
"three <em>hit</em> tag",
]
highlighted.each do |highlighted_tag|
if (index = tags.index(highlighted_tag.gsub(/<\/?em>/, '')))
tags[index] = highlighted_tag
end
end
puts tags #=>
# one <em>hit</em> tag
# two foo tag
# three <em>hit</em> tag
# four foo tag
This won't win a prize for the prettiest code, but I think it gets the job done.
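The same idea can be wrapped in a small method that works directly on a parsed search hit. This is only a sketch: `merge_highlights` and its `pre_tag`/`post_tag` parameters are hypothetical names, not part of any Elasticsearch client, and the `hit` hash simply mirrors the response shown in the question. It assumes the default `<em>` tags and that the highlighter returns each value otherwise unchanged. Unlike the loop above, it replaces every array entry equal to a stripped fragment, not only the first one.

```ruby
# Hypothetical helper: merges highlighted fragments back into the full
# array of values taken from _source, preserving the original order.
def merge_highlights(values, highlighted, pre_tag: '<em>', post_tag: '</em>')
  # Map the plain text of each highlighted fragment to the fragment itself.
  by_plain = highlighted.each_with_object({}) do |fragment, lookup|
    plain = fragment.gsub(pre_tag, '').gsub(post_tag, '')
    lookup[plain] = fragment
  end

  # Swap in the highlighted version where one exists; keep the rest as-is.
  values.map { |value| by_plain.fetch(value, value) }
end

# A hit shaped like the one in the question's response (already parsed from JSON).
hit = {
  '_source'   => { 'tags' => ['one hit tag', 'two foo tag', 'three hit tag', 'four foo tag'] },
  'highlight' => { 'tags' => ['one <em>hit</em> tag', 'three <em>hit</em> tag'] }
}

# A hit without matches in this field has no highlight entry, hence the fallback.
merged = merge_highlights(hit['_source']['tags'], hit.dig('highlight', 'tags') || [])
puts merged
```

If the request sets custom `pre_tags` and `post_tags`, the same strings would have to be passed to the method so that stripping them recovers the original value.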