Ana*_*nam 8 elasticsearch elasticsearch-aggregation
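I'm working on an ElasticSearch (6.2) project where the index has many keyword fields, and they are normalized with a lowercase filter so that searches are case-insensitive. Search works well and returns the actual values of the normalized fields (not lowercased). However, aggregations do not return the actual field values (they return the lowercased ones).
The following example is taken from the ElasticSearch documentation.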
https://www.elastic.co/guide/en/elasticsearch/reference/master/normalizer.html
Create the index:
PUT index
{
  "settings": {
    "analysis": {
      "normalizer": {
        "my_normalizer": {
          "type": "custom",
          "char_filter": [],
          "filter": ["lowercase", "asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "foo": {
          "type": "keyword",
          "normalizer": "my_normalizer"
        }
      }
    }
  }
}
Index the documents:
PUT index/_doc/1
{
  "foo": "Bar"
}
PUT index/_doc/2
{
  "foo": "Baz"
}
Search with an aggregation:
GET index/_search
{
  "size": 0,
  "aggs": {
    "foo_terms": {
      "terms": {
        "field": "foo"
      }
    }
  }
}
Result:
{
  "took": 43,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 0.0,
    "hits": {
      "total": 2,
      "max_score": 0.47000363,
      "hits": [
        {
          "_index": "index",
          "_type": "_doc",
          "_id": "1",
          "_score": 0.47000363,
          "_source": {
            "foo": "Bar"
          }
        },
        {
          "_index": "index",
          "_type": "_doc",
          "_id": "2",
          "_score": 0.47000363,
          "_source": {
            "foo": "Baz"
          }
        }
      ]
    }
  },
  "aggregations": {
    "foo_terms": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "bar",
          "doc_count": 1
        },
        {
          "key": "baz",
          "doc_count": 1
        }
      ]
    }
  }
}
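If you check the aggregation, you will see that lowercased values have been returned, e.g. "key": "bar".
Is there any way to change the aggregation so that it returns the actual value?
e.g. "key": "Bar"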
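To illustrate why the bucket keys come back lowercased, here is a minimal Python sketch of the behaviour described above. It is a toy model, not Elasticsearch code: the function `my_normalizer` is a hypothetical stand-in for the `lowercase` and `asciifolding` filters (the folding step is only approximated with Unicode decomposition), and the aggregation is imitated with a simple counter over the normalized terms.

```python
import unicodedata
from collections import Counter


def my_normalizer(value):
    """Rough stand-in for the lowercase + asciifolding filters (an approximation)."""
    lowered = value.lower()
    # Approximate asciifolding: decompose accented characters, then drop combining marks.
    decomposed = unicodedata.normalize("NFKD", lowered)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


docs = [{"foo": "Bar"}, {"foo": "Baz"}]

# _source keeps the original value; the indexed term is the normalized one.
indexed_terms = [my_normalizer(doc["foo"]) for doc in docs]

# A terms aggregation counts indexed terms, so its keys are the normalized values.
buckets = Counter(indexed_terms)
print(dict(buckets))  # {'bar': 1, 'baz': 1}
```

In this model the original casing survives only in the source documents, which matches what the search hits and the buckets show in the result above.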
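If you want case-insensitive search but exact values in aggregations, you don't need any normalizer. You can simply have a text field (which lowercases tokens and therefore allows case-insensitive search by default) with a keyword sub-field. The former is used for searching, the latter for aggregating. It goes like this: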
PUT index
{
  "mappings": {
    "_doc": {
      "properties": {
        "foo": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}
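The split between the two fields can be pictured with a small Python sketch. This is only a toy model under stated assumptions, not how Elasticsearch is implemented: `analyze_text`, `match`, and `terms_agg` are hypothetical helpers, and the default analyzer is approximated by splitting on non-word characters and lowercasing.

```python
import re
from collections import Counter


def analyze_text(value):
    """Toy stand-in for the default analyzer: split into word tokens and lowercase them."""
    return [token.lower() for token in re.findall(r"\w+", value)]


docs = [{"foo": "Bar"}, {"foo": "Baz"}]

# "foo" (text) is indexed as lowercased tokens; "foo.keyword" keeps the exact string.
index = [
    {"foo": analyze_text(doc["foo"]), "foo.keyword": doc["foo"]}
    for doc in docs
]


def match(query):
    """Return the exact values of documents whose tokens overlap the analyzed query."""
    query_tokens = set(analyze_text(query))
    return [entry["foo.keyword"] for entry in index if query_tokens & set(entry["foo"])]


def terms_agg(field):
    """Count the exact values stored under a keyword-like field."""
    return dict(Counter(entry[field] for entry in index))


print(match("BAR"))              # ['Bar']
print(terms_agg("foo.keyword"))  # {'Bar': 1, 'Baz': 1}
```

The query string is analyzed the same way as the indexed text, which is why `BAR` finds `Bar`, while the counts are taken over the untouched keyword values.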
After indexing your two documents, you can issue a terms aggregation on foo.keyword:
GET index/_search
{
  "size": 2,
  "aggs": {
    "foo_terms": {
      "terms": {
        "field": "foo.keyword"
      }
    }
  }
}
The result looks like this:
{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "index",
        "_type": "_doc",
        "_id": "2",
        "_score": 1,
        "_source": {
          "foo": "Baz"
        }
      },
      {
        "_index": "index",
        "_type": "_doc",
        "_id": "1",
        "_score": 1,
        "_source": {
          "foo": "Bar"
        }
      }
    ]
  },
  "aggregations": {
    "foo_terms": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "Bar",
          "doc_count": 1
        },
        {
          "key": "Baz",
          "doc_count": 1
        }
      ]
    }
  }
}
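The request above has no query, so it aggregates over every document. As a hedged sketch of how the two fields might be combined in one request, the following Python snippet builds a body that searches the text field `foo` with a standard `match` query and aggregates on `foo.keyword`. The helper name `build_search_body` and the search term are assumptions made for illustration; the snippet only prints the JSON and does not contact a cluster.

```python
import json


def build_search_body(text, size=10):
    """Build a search body: match on the analyzed text field, aggregate on the keyword sub-field."""
    return {
        "size": size,
        # The match query analyzes its input, so "BAR" is compared as "bar".
        "query": {"match": {"foo": text}},
        # The terms aggregation reads the exact, un-analyzed values.
        "aggs": {"foo_terms": {"terms": {"field": "foo.keyword"}}},
    }


body = build_search_body("BAR")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of GET index/_search. Note that with a query present, the buckets are computed only over the documents that match it.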