Elasticsearch: top 10 most frequent values in an array across all records

Akh*_*dia 6 elasticsearch

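I have an index named "test". The document structure is shown below. Each document has a "tags" array. How can I query this index to get the 10 most frequently occurring tags?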


Also, what best practices should I follow if this index holds more than 2 million documents?

{
  "_index" : "test",
  "_type" : "data",
  "_id" : "1412879673545024927_1373991666",
  "_score" : 1.0,
  "_source" : {
    "instagramuserid" : "1373991666",
    "likes_count" : 163,
    "@timestamp" : "2017-06-08T08:52:41.803Z",
    "post" : {
      "created_time" : "1482648403",
      "comments" : {
        "count" : 9
      },
      "user_has_liked" : true,
      "link" : "https://www.instagram.com/p/BObjpPMBWWf/",
      "caption" : {
        "created_time" : "1482648403",
        "from" : {
          "full_name" : "PARAMSahib ™",
          "profile_picture" : "https://scontent.cdninstagram.com/t51.2885-19/s150x150/12750236_1692144537739696_350427084_a.jpg",
          "id" : "1373991666",
          "username" : "parambanana"
        },
        "id" : "17845953787172829",
        "text" : "This feature talks about how to work pastels .\n\nDull gold pullover + saffron khadi kurta + baby pink pants + Deep purple patka and white sneakers - Perfect colours for a Happy sunday christmas morning . \n#paramsahib #men #menswear #mensfashion #mensfashionblog #mensfashionblogger #menswearofficial #menstyle #fashion #fashionfashion #fashionblog #blog #blogger #designer #fashiondesigner #streetstyle #streetfashion #sikh #sikhfashion #singhstreetstyle #sikhdesigner #bearded #indian #indianfashionblog #indiandesigner #international #ootd #lookbook #delhistyleblog #delhifashionblog"
      },
      "type" : "image",
      "tags" : [
        "men",
        "delhifashionblog",
        "menswearofficial",
        "fashiondesigner",
        "singhstreetstyle",
        "fashionblog",
        "mensfashion",
        "fashion",
        "sikhfashion",
        "delhistyleblog",
        "sikhdesigner",
        "indianfashionblog",
        "lookbook",
        "fashionfashion",
        "designer",
        "streetfashion",
        "international",
        "paramsahib",
        "mensfashionblogger",
        "indian",
        "blog",
        "mensfashionblog",
        "menstyle",
        "ootd",
        "indiandesigner",
        "menswear",
        "blogger",
        "sikh",
        "streetstyle",
        "bearded"
      ],
      "filter" : "Normal",
      "attribution" : null,
      "location" : null,
      "id" : "1412879673545024927_1373991666",
      "likes" : {
        "count" : 163
      }
    }
  }
}

Moh*_*aeh 6

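If the tags field uses the default mapping, you can use a terms aggregation like the one below. A terms aggregation returns 10 buckets by default, ordered by document count, so no explicit size is needed for a top 10. Note that it needs a keyword (not analyzed) field: with the default dynamic mapping in Elasticsearch 5.x and later, that is usually the post.tags.keyword sub-field rather than post.tags itself.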

{
   "size": 0,
   "aggs": {
      "frequent_tags": {
         "terms": {"field": "post.tags"}
      }
   }
}
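For use from application code, the aggregation above could be wrapped roughly as follows. This is only a sketch: the helper names `build_top_tags_query` and `extract_top_tags` and the `sample_response` data are made up for illustration, and the field name may need to be `post.tags.keyword` depending on the index mapping. The response shape assumed here is the usual `aggregations` / `buckets` layout of a terms aggregation result.

```python
import json


def build_top_tags_query(field="post.tags", size=10):
    """Build a request body asking for the `size` most frequent values of `field`."""
    return {
        "size": 0,  # return no hits; only the aggregation result is needed
        "aggs": {
            "frequent_tags": {
                "terms": {"field": field, "size": size}
            }
        },
    }


def extract_top_tags(response):
    """Convert the terms-aggregation buckets into (tag, doc_count) pairs."""
    buckets = response["aggregations"]["frequent_tags"]["buckets"]
    return [(bucket["key"], bucket["doc_count"]) for bucket in buckets]


# Hypothetical response fragment, hand-written for illustration only;
# it is not real output from the "test" index.
sample_response = {
    "aggregations": {
        "frequent_tags": {
            "buckets": [
                {"key": "fashion", "doc_count": 3},
                {"key": "menswear", "doc_count": 2},
            ]
        }
    }
}

if __name__ == "__main__":
    print(json.dumps(build_top_tags_query(), indent=2))
    print(extract_top_tags(sample_response))
```

The dictionary returned by `build_top_tags_query()` is the same JSON body as in the answer, with the bucket count made explicit, so it can be sent to the `_search` endpoint of the "test" index with whatever HTTP or Elasticsearch client is already in use.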