Group documents by field value

Question

Note that this is not a "how do I get counts of distinct values" question. I need the documents themselves, not the counts.

Suppose I have this mapping:

country, color, height, weight


I have indexed these documents:

1. RU, red, 180, 90
2. BY, green, 170, 80
3. BY, blue, 180, 75
4. KZ, blue, 180, 95
5. KZ, red, 185, 100
6. KZ, red, 175, 80
7. KZ, red, 170, 80


I want to run a query along the lines of groupby(country, color, doc_limit=2) that would return something like this:

{
  "RU": {
    "red": [
      (doc 1. RU, red, 180, 90)
    ],
  },
  "BY": {
    "green": [
      (doc 2)
    ],
    "blue": [
      (doc 3)
    ]
  },
  "KZ": {
    "blue": [
      (doc 4)
    ],
    "red": [
      (doc 5),
      (doc 6)
    ]
  }
}


Each bucket contains at most 2 documents.
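
To make the intended semantics concrete, here is a minimal in-memory sketch in plain Python. It is only an illustration of the grouping I am after, not a query: the `group_docs` helper and the dict layout of the documents (including the `id` key) are made up for this example.

```python
from collections import defaultdict

# The seven sample documents from above, as plain dicts.
# The "id" key is illustrative and just mirrors the list numbering.
DOCS = [
    {"id": 1, "country": "RU", "color": "red", "height": 180, "weight": 90},
    {"id": 2, "country": "BY", "color": "green", "height": 170, "weight": 80},
    {"id": 3, "country": "BY", "color": "blue", "height": 180, "weight": 75},
    {"id": 4, "country": "KZ", "color": "blue", "height": 180, "weight": 95},
    {"id": 5, "country": "KZ", "color": "red", "height": 185, "weight": 100},
    {"id": 6, "country": "KZ", "color": "red", "height": 175, "weight": 80},
    {"id": 7, "country": "KZ", "color": "red", "height": 170, "weight": 80},
]


def group_docs(docs, outer, inner, doc_limit):
    """Group docs by two fields, keeping at most doc_limit docs per bucket."""
    grouped = defaultdict(lambda: defaultdict(list))
    for doc in docs:
        bucket = grouped[doc[outer]][doc[inner]]
        # Keep only the first doc_limit documents seen for this bucket.
        if len(bucket) < doc_limit:
            bucket.append(doc)
    # Convert the nested defaultdicts to plain dicts for easier inspection.
    return {key: dict(inner_groups) for key, inner_groups in grouped.items()}


result = group_docs(DOCS, "country", "color", doc_limit=2)
```

Here the KZ/red bucket keeps documents 5 and 6 and drops document 7, because the limit is 2 per bucket.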

How can I do this?

Answer 1

Val*_*Val 5

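You can do this with a terms aggregation on the country field, a terms sub-aggregation on the color field nested inside it, and finally a top_hits aggregation that returns up to 2 matching documents per bucket: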

{
   "size": 0,
   "aggs": {
      "countries": {
         "terms": {
            "field": "country"
         },
         "aggs": {
            "colors": {
               "terms": {
                  "field": "color"
               },
               "aggs": {
                  "docs": {
                     "top_hits": {
                        "size": 2
                     }
                  }
               }
            }
         }
      }
   }
}
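
The response to this query comes back as nested buckets rather than in the `{country: {color: [docs]}}` shape from the question. As a rough sketch of how it could be flattened on the client side, the Python below walks the buckets by the aggregation names used above (`countries`, `colors`, `docs`). The `reshape` helper is hypothetical, and `sample_response` is a hand-written, heavily abbreviated stand-in for a real response, not actual output.

```python
def reshape(response):
    """Turn the nested aggregation response into {country: {color: [docs]}}."""
    grouped = {}
    for country in response["aggregations"]["countries"]["buckets"]:
        colors = {}
        for color in country["colors"]["buckets"]:
            # Each top_hits result carries the original document in "_source".
            hits = color["docs"]["hits"]["hits"]
            colors[color["key"]] = [hit["_source"] for hit in hits]
        grouped[country["key"]] = colors
    return grouped


# Hand-written, abbreviated stand-in for a response (illustrative only).
sample_response = {
    "aggregations": {
        "countries": {
            "buckets": [
                {
                    "key": "BY",
                    "colors": {
                        "buckets": [
                            {
                                "key": "green",
                                "docs": {"hits": {"hits": [
                                    {"_source": {"country": "BY", "color": "green",
                                                 "height": 170, "weight": 80}},
                                ]}},
                            },
                            {
                                "key": "blue",
                                "docs": {"hits": {"hits": [
                                    {"_source": {"country": "BY", "color": "blue",
                                                 "height": 180, "weight": 75}},
                                ]}},
                            },
                        ]
                    },
                }
            ]
        }
    }
}

grouped = reshape(sample_response)
```

With the sample above, `grouped["BY"]["green"]` is a one-element list holding the green BY document.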


@Val How would you change the query if you wanted all the documents instead of just the top 2? (3 upvotes)
