Tim*_*Tim 4 distinct filter nosql elasticsearch
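I have a large document store in Elasticsearch and would like to retrieve the distinct filter values so I can display them in HTML dropdowns.
An example document set looks like this: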
[
  {
    "name": "John Doe",
    "departments": [
      {
        "name": "Accounts"
      },
      {
        "name": "Management"
      }
    ]
  },
  {
    "name": "Jane Smith",
    "departments": [
      {
        "name": "IT"
      },
      {
        "name": "Management"
      }
    ]
  }
]
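The dropdown should contain a list of departments, i.e. IT, Accounts and Management.
Could someone point me in the right direction for retrieving a distinct list of departments from Elasticsearch?
Thanks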
This is a job for a terms aggregation (documentation).
You can get the distinct departments values like this:
POST company/employee/_search
{
  "size": 0,
  "aggs": {
    "by_departments": {
      "terms": {
        "field": "departments.name",
        "size": 0 //see note 1
      }
    }
  }
}
With your example, this outputs:
{
  ...
  "aggregations": {
    "by_departments": {
      "buckets": [
        {
          "key": "management", //see note 2
          "doc_count": 2
        },
        {
          "key": "accounts",
          "doc_count": 1
        },
        {
          "key": "it",
          "doc_count": 1
        }
      ]
    }
  }
}
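To get from that response to the dropdown, you only need the bucket keys. Here is a minimal Python sketch, assuming the response has already been parsed into a dict; the helper names `distinct_keys` and `to_options` are made up for illustration and are not part of any Elasticsearch client.

```python
from html import escape


def distinct_keys(response, agg_name="by_departments"):
    """Return the bucket keys of a terms aggregation, in response order."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [bucket["key"] for bucket in buckets]


def to_options(keys):
    """Render keys as HTML <option> elements, escaping special characters."""
    return "\n".join(
        '<option value="{0}">{0}</option>'.format(escape(str(key), quote=True))
        for key in keys
    )


# Same shape as the aggregation output shown above (hits and metadata omitted).
sample_response = {
    "aggregations": {
        "by_departments": {
            "buckets": [
                {"key": "management", "doc_count": 2},
                {"key": "accounts", "doc_count": 1},
                {"key": "it", "doc_count": 1},
            ]
        }
    }
}

if __name__ == "__main__":
    print(to_options(distinct_keys(sample_response)))
```

The keys come back in the order Elasticsearch returned them, so sort them yourself if the dropdown should be alphabetical.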
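Two additional notes:
1. Setting size to 0 sets the maximum number of buckets to Integer.MAX_VALUE. Don't use it if departments has too many distinct values.
2. The keys are the terms produced by analyzing the departments values. Be sure to run your terms aggregation on a field mapped as not_analyzed.
For example, with the default mapping (departments.name is an analyzed string), adding this employee: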
{
  "name": "Bill Gates",
  "departments": [
    {
      "name": "IT"
    },
    {
      "name": "Human Resource"
    }
  ]
}
leads to this result:
{
  ...
  "aggregations": {
    "by_departments": {
      "buckets": [
        {
          "key": "it",
          "doc_count": 2
        },
        {
          "key": "management",
          "doc_count": 2
        },
        {
          "key": "accounts",
          "doc_count": 1
        },
        {
          "key": "human",
          "doc_count": 1
        },
        {
          "key": "resource",
          "doc_count": 1
        }
      ]
    }
  }
}
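If it is not obvious why "Human Resource" ends up as two buckets, the following Python sketch imitates the effect locally. Note that `rough_analyze` is only a crude stand-in for an analyzed string field (lowercase, then split on non-alphanumeric characters); it is an assumption made for illustration, not Elasticsearch's actual analyzer, and `doc_counts` and `keep_whole` are hypothetical helper names as well.

```python
import re
from collections import Counter


def rough_analyze(value):
    """Crude stand-in for an analyzed field: lowercase and split into words."""
    return [token for token in re.split(r"[^0-9a-z]+", value.lower()) if token]


def keep_whole(value):
    """Stand-in for a not_analyzed field: the value is indexed as one term."""
    return [value]


def doc_counts(employees, analyze):
    """Count, per term, how many documents contain it (like doc_count)."""
    counts = Counter()
    for employee in employees:
        terms = set()
        for department in employee["departments"]:
            terms.update(analyze(department["name"]))
        counts.update(terms)  # each document counts once per term
    return counts


employees = [
    {"name": "John Doe",
     "departments": [{"name": "Accounts"}, {"name": "Management"}]},
    {"name": "Jane Smith",
     "departments": [{"name": "IT"}, {"name": "Management"}]},
    {"name": "Bill Gates",
     "departments": [{"name": "IT"}, {"name": "Human Resource"}]},
]

if __name__ == "__main__":
    print(dict(doc_counts(employees, rough_analyze)))
    print(dict(doc_counts(employees, keep_whole)))
```

With `rough_analyze` the counts are keyed by lowercase single words, which is the shape of the output above; with `keep_whole` each department name stays intact.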
With the correct mapping:
POST company
{
  "mappings": {
    "employee": {
      "properties": {
        "name": {
          "type": "string"
        },
        "departments": {
          "type": "object",
          "properties": {
            "name": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }
  }
}
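The mapping above uses the string type with not_analyzed, which matches the Elasticsearch version this answer was written against. As far as I know, later releases (5.x onward) split string into text and keyword, and 7.x dropped the mapping type name (employee here). Below is a sketch of what the equivalent index body might look like under those assumptions, built as a Python dict; `build_mapping` is a hypothetical helper, and I have not run this against a live cluster, so check it against the documentation for your version.

```python
import json


def build_mapping():
    """Index body for newer Elasticsearch versions (assumed 7.x or later)."""
    return {
        "mappings": {
            "properties": {
                # Full-text field, analyzed at index time.
                "name": {"type": "text"},
                "departments": {
                    "properties": {
                        # keyword fields are indexed as a single exact term,
                        # which is what a terms aggregation needs here.
                        "name": {"type": "keyword"}
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent when creating the index.
    print(json.dumps(build_mapping(), indent=2))
```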
the same request finally outputs:
{
  ...
  "aggregations": {
    "by_departments": {
      "buckets": [
        {
          "key": "IT",
          "doc_count": 2
        },
        {
          "key": "Management",
          "doc_count": 2
        },
        {
          "key": "Accounts",
          "doc_count": 1
        },
        {
          "key": "Human Resource",
          "doc_count": 1
        }
      ]
    }
  }
}
Hope this helps!