Elasticsearch - distinct items from multiple fields

rit*_*ITW 4 elasticsearch

I created a mapping in Elasticsearch to index my MongoDB collection. Here are the mapping properties:

"properties" : {
          "address_components" : {
            "properties" : {
              "_id" : {
                "type" : "string"
              },
              "subLocality1" : {
                "type" : "string",
                "index" : "not_analyzed"
              },
              "subLocality2" : {
                "type" : "string",
                "index" : "not_analyzed"
              },
              "subLocality3" : {
                "type" : "string",
                "index" : "not_analyzed"
              },
              "city" : {
                "type" : "string",
                "index" : "not_analyzed"
              }
            }
          }
}

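Now I want to retrieve the overall distinct items across these fields: subLocality1, subLocality2, subLocality3, city. In addition, each distinct value should contain q as a substring. Each distinct item should also include the corresponding city value.

Example: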

"address_components" : {
    "subLocality1" : "s1",
    "subLocality2" : "s1",
    "subLocality3" : "s2",
    "city" : "a"
  }

"address_components" : {
    "subLocality1" : "s3",
    "subLocality2" : "s1",
    "subLocality3" : "s2",
    "city" : "a"
  }

"address_components" : {
    "subLocality1" : "s2",
    "subLocality2" : "s1",
    "subLocality3" : "s4",
    "city" : "a"
  }

For the indexed documents above, the expected result is:

"address_components" : {
    "subLocality1" : "s1",
    "subLocality2" : "s1",
    "subLocality3" : "s2",
    "city" : "ct1"
  }

"address_components" : {
    "subLocality1" : "s3",
    "subLocality2" : "s1",
    "subLocality3" : "s2",
    "city" : "ct1"
  }

"address_components" : {
    "subLocality1" : "s2",
    "subLocality2" : "s1",
    "subLocality3" : "s4",
    "city" : "ct1"
  }
{s1, a}, {s2,a}, {s3,a}, {s4,a},{a,a}
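To make the requirement concrete, here is a minimal plain-Python sketch of the result being asked for: distinct (value, city) pairs taken from all four fields, optionally filtered by a substring q. It runs on the three sample documents with city "a" and does not talk to Elasticsearch at all; the names `docs`, `FIELDS` and `distinct_with_city` are made up for illustration.

```python
# Sample documents from the question (only the address_components part).
docs = [
    {"subLocality1": "s1", "subLocality2": "s1", "subLocality3": "s2", "city": "a"},
    {"subLocality1": "s3", "subLocality2": "s1", "subLocality3": "s2", "city": "a"},
    {"subLocality1": "s2", "subLocality2": "s1", "subLocality3": "s4", "city": "a"},
]

# Fields whose values should be merged into one distinct set.
FIELDS = ("subLocality1", "subLocality2", "subLocality3", "city")


def distinct_with_city(docs, q=""):
    """Return the sorted distinct (value, city) pairs whose value contains q."""
    pairs = set()
    for doc in docs:
        for field in FIELDS:
            value = doc[field]
            if q in value:  # an empty q matches every value
                pairs.add((value, doc["city"]))
    return sorted(pairs)


if __name__ == "__main__":
    print(distinct_with_city(docs))         # all distinct pairs
    print(distinct_with_city(docs, q="s"))  # only values containing "s"
```

With an empty q this yields the same five pairs as the expected result listed above, just in sorted order.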

I tried to do this with the Elasticsearch terms aggregation.

GET /rescu/rescu/_search?pretty=true&search_type=count

{
    "aggs" : {
        "distinct_locations" : {
            "terms" : {
                "script" : "doc['address_components.subLocality1'].value"
            }
        }
    }
}

According to the following link, the terms aggregation works on a single field only.

rit*_*ITW 7

After going through the Elasticsearch API documentation, I found the answer myself. We need to use a script to retrieve terms from multiple fields.

GET /rescu/rescu/_search?pretty=true&search_type=count
{
  "aggs": {
    "distinct_locations": {
      "terms": {
        "script": "[doc['address_components.subLocality1'].value,doc['address_components.subLocality2'].value,doc['address_components.subLocality3'].value]",
        "size": 5000
      }
    }
  }
}
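As a rough illustration of what a script returning a list of values does to the buckets, the following plain-Python sketch merges the three subLocality fields of the sample documents from the question into one set of terms. It is only a simulation, not Elasticsearch code, and it assumes that a document is counted once per distinct value even when the same value appears in several of its fields. The names `docs` and `multi_field_terms` are hypothetical.

```python
from collections import Counter

# Sample documents from the question (only the address_components part).
docs = [
    {"subLocality1": "s1", "subLocality2": "s1", "subLocality3": "s2", "city": "a"},
    {"subLocality1": "s3", "subLocality2": "s1", "subLocality3": "s2", "city": "a"},
    {"subLocality1": "s2", "subLocality2": "s1", "subLocality3": "s4", "city": "a"},
]


def multi_field_terms(docs, fields):
    """Simulate terms buckets over several fields: value -> number of documents."""
    counts = Counter()
    for doc in docs:
        # Assumption: duplicates inside one document count only once,
        # so the values are collected into a set before counting.
        counts.update({doc[f] for f in fields if f in doc})
    return counts


if __name__ == "__main__":
    buckets = multi_field_terms(docs, ["subLocality1", "subLocality2", "subLocality3"])
    for term, doc_count in sorted(buckets.items()):
        print(term, doc_count)
```

Note that the query as posted returns only the merged distinct values; it covers neither the substring filter on q nor the pairing with city from the question.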


Fua*_*ndi 5

Here is an example with two fields: country and city. It uses a terms aggregation on country with a sub-aggregation on city:

{
  "size": 0,
  "aggs": {
    "country": {
      "terms": {
        "field": "country"
      },
      "aggregations": {
        "city": {
          "terms": {
            "field": "city"
          }
        }
      }
    }
  }
}

You can nest sub-aggregations to multiple levels.
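If the nesting gets deeper, the request body can be generated instead of written by hand. This is a minimal sketch, assuming each level is a plain terms aggregation named after its field; `nested_terms_aggs` is a hypothetical helper, and it only builds the dictionary without sending anything to Elasticsearch.

```python
import json


def nested_terms_aggs(fields):
    """Build a search body with one terms aggregation per field, nested in order."""
    if not fields:
        raise ValueError("at least one field is required")
    aggs = None
    # Build from the innermost level outwards.
    for field in reversed(fields):
        level = {"terms": {"field": field}}
        if aggs is not None:
            level["aggregations"] = aggs
        aggs = {field: level}
    # "size": 0 suppresses the hits so only the buckets come back.
    return {"size": 0, "aggs": aggs}


if __name__ == "__main__":
    print(json.dumps(nested_terms_aggs(["country", "city"]), indent=2))
```

Called with ["country", "city"], it produces the same structure as the request above; adding a third field simply adds one more level inside the city aggregation.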