mos*_*ski 5 elasticsearch elasticsearch-aggregation elasticsearch-dsl-py
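I want to sum a list of documents by their "amount" field (each document has two fields: timestamp and amount) until a certain value is reached. For example, I want to get the list of documents, sorted by timestamp, whose amounts add up to 100. Can this be done in a single query?
Here is my query, which returns the total amount. I would like to add a condition that stops the aggregation once a certain value is reached.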
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "timestamp": {
              "gte": 1525168583
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "total_amount": {
      "sum": {
        "field": "amount"
      }
    }
  },
  "sort": [
    "timestamp"
  ],
  "size": 10000
}
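Thanks.
This is entirely possible using a combination of function_score scripting to mimic sorting, a filter aggregation for the range gte query, and a healthy amount of scripted_metric aggregation to limit the summation to a certain amount.
Let's first set up a mapping and ingest some documents: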
PUT summation
{
  "mappings": {
    "properties": {
      "timestamp": {
        "type": "date",
        "format": "epoch_second"
      }
    }
  }
}

POST summation/_doc
{
  "context": "newest",
  "timestamp": 1587049128,
  "amount": 20
}

POST summation/_doc
{
  "context": "2nd newest",
  "timestamp": 1586049128,
  "amount": 30
}

POST summation/_doc
{
  "context": "3rd newest",
  "timestamp": 1585049128,
  "amount": 40
}

POST summation/_doc
{
  "context": "4th newest",
  "timestamp": 1585049128,
  "amount": 30
}
Then run the query:
GET summation/_search
{
  "size": 0,
  "aggs": {
    "filtered_agg": {
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "timestamp": {
                  "gte": 1585049128
                }
              }
            },
            {
              "function_score": {
                "query": {
                  "match_all": {}
                },
                "script_score": {
                  "script": {
                    "source": "return (params['now'] - doc['timestamp'].date.toMillis())",
                    "params": {
                      "now": 1587049676
                    }
                  }
                }
              }
            }
          ]
        }
      },
      "aggs": {
        "limited_sum": {
          "scripted_metric": {
            "init_script": """
              state['my_hash'] = new HashMap();
              state['my_hash'].put('sum', 0);
              state['my_hash'].put('docs', new ArrayList());
            """,
            "map_script": """
              if (state['my_hash']['sum'] <= 100) {
                state['my_hash']['sum'] += doc['amount'].value;
                state['my_hash']['docs'].add(doc['context.keyword'].value);
              }
            """,
            "combine_script": "return state['my_hash']",
            "reduce_script": "return states[0]"
          }
        }
      }
    }
  }
}
which yields:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "filtered_agg" : {
      "meta" : { },
      "doc_count" : 4,
      "limited_sum" : {
        "value" : {
          "docs" : [
            "newest",
            "2nd newest",
            "3rd newest",
            "4th newest"
          ],
          "sum" : 120
        }
      }
    }
  }
}
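The sum comes out as 120 rather than 100 because the map_script checks the running total before adding each amount, so the document that crosses the limit is still counted. The following is a plain-Python model of the four scripted_metric phases, offered only as a sketch to make that arithmetic visible. It does not talk to Elasticsearch, and the function names (init_state, map_doc, combine, reduce_states) are made up for illustration.

```python
# Plain-Python imitation of the Painless scripts in the query above.
# All function names here are hypothetical; nothing calls Elasticsearch.

def init_state():
    # init_script: each shard starts with an empty running state
    return {"sum": 0, "docs": []}

def map_doc(state, doc, limit=100):
    # map_script: the check runs BEFORE the amount is added,
    # so the document that pushes the sum past the limit is kept.
    if state["sum"] <= limit:
        state["sum"] += doc["amount"]
        state["docs"].append(doc["context"])

def combine(state):
    # combine_script: pass the per-shard state to the coordinating node
    return state

def reduce_states(states):
    # reduce_script: "return states[0]" keeps only the first shard's state
    return states[0]

# The four sample documents, in the order shown in the response
sample_docs = [
    {"context": "newest", "amount": 20},
    {"context": "2nd newest", "amount": 30},
    {"context": "3rd newest", "amount": 40},
    {"context": "4th newest", "amount": 30},
]

state = init_state()
for doc in sample_docs:
    map_doc(state, doc)

result = reduce_states([combine(state)])
print(result)
# {'sum': 120, 'docs': ['newest', '2nd newest', '3rd newest', '4th newest']}
```

After the third document the running sum is 90, which still passes the `<= 100` check, so the fourth amount of 30 is added as well. If the total must never exceed the limit, the condition would need to test the sum including the next amount instead.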
Here I chose to return only the doc.context values, but you can adjust it to retrieve whatever you like, be it IDs, amounts, etc.
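As a possible alternative that is not part of the answer above: a filter clause does not compute scores, a scripted_metric visits documents in whatever order each shard collects them, and `states[0]` reflects only one shard. When strict timestamp order matters, it may therefore be simpler to run the question's original sorted query and cut off the running total on the client. The sketch below assumes the hits are already sorted by timestamp and have the usual `_source` shape; take_until_total is a hypothetical helper name, and the sample hits are hand-written rather than fetched from a cluster.

```python
def take_until_total(hits, target, field="amount"):
    """Return the leading hits whose running total first reaches `target`,
    together with that total. `hits` must already be in the desired order."""
    selected = []
    running = 0
    for hit in hits:
        if running >= target:
            break  # target reached, ignore the remaining hits
        running += hit["_source"][field]
        selected.append(hit)
    return selected, running

# Hand-written stand-in for response["hits"]["hits"], sorted by timestamp
hits = [
    {"_source": {"context": "3rd newest", "timestamp": 1585049128, "amount": 40}},
    {"_source": {"context": "4th newest", "timestamp": 1585049128, "amount": 30}},
    {"_source": {"context": "2nd newest", "timestamp": 1586049128, "amount": 30}},
    {"_source": {"context": "newest", "timestamp": 1587049128, "amount": 20}},
]

picked, total = take_until_total(hits, target=100)
print([h["_source"]["context"] for h in picked], total)
# ['3rd newest', '4th newest', '2nd newest'] 100
```

The trade-off is that the documents travel over the wire, so this only suits result sets small enough to page through.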