Elasticsearch aggregation on top hits

Question

My documents have the following structure:

{
   "chefInfo": {
      "id": int,
      "employed": String,
      ... Some more recipe information ...
   },
   "recipe": {
      ... Some recipe information ...
   }
}

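If a chef has multiple recipes, the nested chefInfo block is identical in each of that chef's documents. My problem is that I want to aggregate on a field in the chefInfo part of the document, but the aggregation does not take into account that the chefInfo block is repeated.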

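So if the chef with ID 1 has 5 recipes and I aggregate on the employed field, that chef contributes 5 to the counts in the aggregation, whereas I want each chef to be counted only once.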

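I considered a top_hits aggregation on the chef ID followed by a sub-aggregation over all the buckets, but I can't figure out how to count across the results of all the buckets.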

Is what I want to do possible?

Answer 1

Nis*_*ini 5

For Elasticsearch, every document is unique in itself. In your case you want to define uniqueness based on a different field, here chefInfo.id. To get a unique count based on this field, use the cardinality aggregation.

You can apply the aggregation as below:

{
  "aggs": {
    "employed": {
      "nested": {
        "path": "chefInfo"
      },
      "aggs": {
        "employed": {
          "terms": {
            "field": "chefInfo.employed.keyword"
          },
          "aggs": {
            "employed_unique": {
              "cardinality": {
                "field": "chefInfo.id"
              }
            }
          }
        }
      }
    }
  }
}

In the result, employed_unique gives you the expected count: the number of distinct chefInfo.id values within each employed bucket.
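To make the difference between a bucket's doc_count and the employed_unique value concrete, here is a minimal in-memory sketch in Python. It does not call Elasticsearch; the sample documents and the helper name `count_per_employed` are hypothetical, and the count is exact, whereas the real cardinality aggregation may be approximate.

```python
from collections import defaultdict

# Hypothetical sample data: chef 1 has three recipes, chef 2 has one,
# chef 3 has two. The chefInfo block repeats in every recipe document.
docs = [
    {"chefInfo": {"id": 1, "employed": "yes"}, "recipe": {"name": "soup"}},
    {"chefInfo": {"id": 1, "employed": "yes"}, "recipe": {"name": "stew"}},
    {"chefInfo": {"id": 1, "employed": "yes"}, "recipe": {"name": "salad"}},
    {"chefInfo": {"id": 2, "employed": "yes"}, "recipe": {"name": "bread"}},
    {"chefInfo": {"id": 3, "employed": "no"}, "recipe": {"name": "cake"}},
    {"chefInfo": {"id": 3, "employed": "no"}, "recipe": {"name": "pie"}},
]


def count_per_employed(documents):
    """Group by chefInfo.employed; report document count and distinct chef ids."""
    doc_counts = defaultdict(int)  # what a plain terms bucket would report
    chef_ids = defaultdict(set)    # distinct ids, like a cardinality sub-aggregation
    for doc in documents:
        chef = doc["chefInfo"]
        doc_counts[chef["employed"]] += 1
        chef_ids[chef["employed"]].add(chef["id"])
    return {
        key: {"doc_count": doc_counts[key], "employed_unique": len(chef_ids[key])}
        for key in doc_counts
    }


result = count_per_employed(docs)
for key, counts in sorted(result.items()):
    print(key, counts)
```

With this sample data the "yes" group contains four documents but only two distinct chefs, which is the number the question is after.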

Also be aware that if `chefInfo.id` has high cardinality, [counts may be approximate]( Aggregation.html#_counts_are_approximate) (2 upvotes)
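If the approximation matters, the cardinality aggregation accepts a `precision_threshold` option that trades memory for accuracy. The sketch below only builds the request body from the answer as a Python dict with that option added; the helper name `build_query` is hypothetical, and the value 40000 is used on the assumption that it is the maximum the aggregation supports.

```python
import json


def build_query(precision_threshold=40000):
    """Build the answer's aggregation with precision_threshold added.

    Assumption: counts below the threshold are expected to be close to
    exact, at the cost of more memory per bucket.
    """
    return {
        "aggs": {
            "employed": {
                "nested": {"path": "chefInfo"},
                "aggs": {
                    "employed": {
                        "terms": {"field": "chefInfo.employed.keyword"},
                        "aggs": {
                            "employed_unique": {
                                "cardinality": {
                                    "field": "chefInfo.id",
                                    "precision_threshold": precision_threshold,
                                }
                            }
                        },
                    }
                },
            }
        }
    }


# Print the body so it can be pasted into a search request.
print(json.dumps(build_query(), indent=2))
```

The printed JSON has the same shape as the query in the answer, so it can be sent to the same search endpoint.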
