Elasticsearch: sort based on the elements in an array that satisfy a filter

Question

Alb*_*ill 5 elasticsearch elasticsearch-2.0

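My type has a field that is an array of times in ISO 8601 format. I want to get all listings that have a time on a given day, and then order them by the earliest time on that specific day. The problem is that my query sorts by the earliest time across all days.

You can reproduce the problem with the following script.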

curl -XPUT 'localhost:9200/listings?pretty'

curl -XPOST 'localhost:9200/listings/listing/_bulk?pretty' -d '
{"index": { } }
{ "name": "second on 6th (3rd on the 5th)", "times": ["2018-12-05T12:00:00","2018-12-06T11:00:00"] }
{"index": { } }
{ "name": "third on 6th (1st on the 5th)", "times": ["2018-12-05T10:00:00","2018-12-06T12:00:00"] }
{"index": { } }
{ "name": "first on the 6th (2nd on the 5th)", "times": ["2018-12-05T11:00:00","2018-12-06T10:00:00"] }
'

# because ES takes time to add them to index 
sleep 2

echo "Query listings on the 6th!"

curl -XPOST 'localhost:9200/listings/_search?pretty' -d '
{
  "sort": {
    "times": {
      "order": "asc",
      "nested_filter": {
        "range": {
          "times": {
            "gte": "2018-12-06T00:00:00",
            "lte": "2018-12-06T23:59:59"
          }
        }
      }
    }
  },
  "query": {
    "bool": {
      "filter": {
        "range": {
          "times": {
            "gte": "2018-12-06T00:00:00",
            "lte": "2018-12-06T23:59:59"
          }
        }
      }
    }
  }
}'

curl -XDELETE 'localhost:9200/listings?pretty'

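Put the script above in a .sh file and run it to reproduce the problem. You will see that the ordering is based on the 5th rather than the 6th. Elasticsearch converts the times to epoch_millis numbers for sorting; you can see the epoch number in the sort field of each hit object, e.g. 1544007600000. When doing an asc sort, it takes the smallest number in the array (the order of the elements does not matter) and sorts based on that.

Somehow I need it to sort by the earliest time that occurs on the queried day, i.e. the 6th.

I am currently using Elasticsearch 2.4, but even if someone could show me how it is done in the current version, that would be great.

Here is their documentation on nested queries and scripting, if that helps.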
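To make that behavior concrete, here is a minimal Python sketch that mimics (outside of Elasticsearch) how an ascending sort on a plain multi-valued date field picks its sort key. It assumes zone-less timestamps are treated as UTC, which is consistent with the epoch value quoted above; the function names are made up for illustration.

```python
from datetime import datetime, timezone

# The three listings from the reproduction script.
LISTINGS = [
    {"name": "second on 6th (3rd on the 5th)",
     "times": ["2018-12-05T12:00:00", "2018-12-06T11:00:00"]},
    {"name": "third on 6th (1st on the 5th)",
     "times": ["2018-12-05T10:00:00", "2018-12-06T12:00:00"]},
    {"name": "first on the 6th (2nd on the 5th)",
     "times": ["2018-12-05T11:00:00", "2018-12-06T10:00:00"]},
]


def to_epoch_millis(iso_time):
    """Convert a zone-less ISO 8601 string to epoch milliseconds (assumed UTC)."""
    parsed = datetime.strptime(iso_time, "%Y-%m-%dT%H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp() * 1000)


def asc_sort_key(listing):
    """Ascending sort on a plain array: the smallest value of the whole array wins."""
    return min(to_epoch_millis(t) for t in listing["times"])


def sort_like_plain_array(listings):
    """Order listings the way the failing query does, ignoring the day filter."""
    return sorted(listings, key=asc_sort_key)


if __name__ == "__main__":
    for listing in sort_like_plain_array(LISTINGS):
        print(asc_sort_key(listing), listing["name"])
```

The resulting order follows the times on the 5th ("third on 6th" comes first), which matches the symptom described: the range filter selects documents, but it does not narrow which array element is used as the sort key.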

Answer 1

Stu*_*ing 3

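I think the problem here is that nested sorting applies to nested objects, not to arrays.

If you convert the documents so that they use an array of nested objects instead of a simple array of dates, you can build a nested filtered sort that works.

The following is for Elasticsearch 6.0 - they changed the syntax a bit from 6.1 onward, and I'm not sure how much of it applies to 2.x:

Mapping: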

PUT nested-listings
{
  "mappings": {
    "listing": {
      "properties": {
        "name": {
          "type": "keyword"
        },
        "openTimes": {
          "type": "nested",
          "properties": {
            "date": {
              "type": "date"
            }
          }
        }
      }
    }
  }
}

Data:

POST nested-listings/listing/_bulk
{"index": { } }
{ "name": "second on 6th (3rd on the 5th)", "openTimes": [ { "date": "2018-12-05T12:00:00" }, { "date": "2018-12-06T11:00:00" }] }
{"index": { } }
{ "name": "third on 6th (1st on the 5th)", "openTimes": [ {"date": "2018-12-05T10:00:00"}, { "date": "2018-12-06T12:00:00" }] }
{"index": { } }
{ "name": "first on the 6th (2nd on the 5th)", "openTimes": [ {"date": "2018-12-05T11:00:00" }, { "date": "2018-12-06T10:00:00" }] }

So instead of "nextNextOpenTimes" we have an "openTimes" nested object, and each listing contains an array of openTimes.

Now the search:
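If the existing documents use the flat "times" array from the question, the reshaping can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, and actually reindexing would still mean sending the generated body to the _bulk endpoint.

```python
import json


def to_nested_listing(flat_listing):
    """Turn {"name", "times": [...]} into {"name", "openTimes": [{"date": ...}, ...]}."""
    return {
        "name": flat_listing["name"],
        "openTimes": [{"date": t} for t in flat_listing["times"]],
    }


def to_bulk_body(flat_listings):
    """Build a newline-delimited _bulk body: one action line plus one source line per doc."""
    lines = []
    for listing in flat_listings:
        lines.append(json.dumps({"index": {}}))
        lines.append(json.dumps(to_nested_listing(listing)))
    # The bulk format requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    flat = [{"name": "second on 6th (3rd on the 5th)",
             "times": ["2018-12-05T12:00:00", "2018-12-06T11:00:00"]}]
    print(to_bulk_body(flat), end="")
```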

POST nested-listings/_search
{
  "sort": {
    "openTimes.date": {
      "order": "asc",
      "nested_path": "openTimes",
      "nested_filter": {
        "range": {
          "openTimes.date": {
            "gte": "2018-12-06T00:00:00",
            "lte": "2018-12-06T23:59:59"
          }
        }
      }
    }
  },
  "query": {
    "nested": {
      "path": "openTimes",
      "query": {
        "bool": {
          "filter": {
            "range": {
              "openTimes.date": {
                "gte": "2018-12-06T00:00:00",
                "lte": "2018-12-06T23:59:59"
              }
            }
          }
        }
      }
    }
  }
}

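The main difference here is that the query is slightly different, because you need to use a "nested" query to filter on nested objects.

This gives the following result: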

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": null,
    "hits": [
      {
        "_index": "nested-listings",
        "_type": "listing",
        "_id": "vHH6e2cB28sphqox2Dcm",
        "_score": null,
        "_source": {
          "name": "first on the 6th (2nd on the 5th)"
        },
        "sort": [
          1544090400000
        ]
      },
      {
        "_index": "nested-listings",
        "_type": "listing",
        "_id": "unH6e2cB28sphqox2Dcm",
        "_score": null,
        "_source": {
          "name": "second on 6th (3rd on the 5th)"
        },
        "sort": [
          1544094000000
        ]
      },
      {
        "_index": "nested-listings",
        "_type": "listing",
        "_id": "u3H6e2cB28sphqox2Dcm",
        "_score": null,
        "_source": {
          "name": "third on 6th (1st on the 5th)"
        },
        "sort": [
          1544097600000
        ]
      }
    ]
  }
}
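For versions from 6.1 onward, my understanding is that the `nested_path` and `nested_filter` sort options were deprecated in favor of a single `nested` object with `path` and `filter` keys. The Python sketch below builds a request body in that newer shape, and reuses one range filter for both the query and the sort so the two cannot drift apart. The helper names are hypothetical, and the body should be checked against the documentation of the version actually in use.

```python
import json


def day_range(field, day):
    """Range filter covering one calendar day, e.g. day='2018-12-06'."""
    return {"range": {field: {"gte": f"{day}T00:00:00", "lte": f"{day}T23:59:59"}}}


def build_search_body(day, path="openTimes", field="openTimes.date"):
    """Search body that filters and sorts on the same day, using the 6.1+ sort syntax."""
    day_filter = day_range(field, day)
    return {
        "sort": {
            field: {
                "order": "asc",
                # Assumed 6.1+ form: replaces nested_path / nested_filter.
                "nested": {"path": path, "filter": day_filter},
            }
        },
        "query": {
            "nested": {
                "path": path,
                "query": {"bool": {"filter": day_filter}},
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_search_body("2018-12-06"), indent=2))
```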

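I don't think you can actually select a single value out of an array in ES, so for sorting you will always be sorting on all the values in the array. With a plain array, the best you can do is choose how the array is reduced to a sort value (use the lowest, the highest, the average, etc.).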
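A rough Python simulation of that limitation, under the assumption that the available sort modes are `min`, `max`, `sum`, `avg` and `median`: each mode reduces the whole array to one number, and none of them is the same as "the earliest value on a given day". The listing below is hypothetical and is not one of the documents from the question; rounding details of the real modes are ignored.

```python
from datetime import datetime, timezone
from statistics import mean, median

# Simplified stand-ins for the sort modes applied to a multi-valued field.
SORT_MODES = {"min": min, "max": max, "sum": sum, "avg": mean, "median": median}


def to_epoch_millis(iso_time):
    """Convert a zone-less ISO 8601 string to epoch milliseconds (assumed UTC)."""
    parsed = datetime.strptime(iso_time, "%Y-%m-%dT%H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp() * 1000)


def sort_value(times, mode):
    """Sort key for a plain array: the chosen mode applied to every value."""
    return SORT_MODES[mode]([to_epoch_millis(t) for t in times])


def earliest_on_day(times, day):
    """What the question actually wants: the smallest value that falls on `day`."""
    on_day = [to_epoch_millis(t) for t in times if t.startswith(day)]
    return min(on_day) if on_day else None


if __name__ == "__main__":
    # Hypothetical listing with times on three different days.
    times = ["2018-12-05T12:00:00", "2018-12-06T11:00:00",
             "2018-12-06T15:00:00", "2018-12-07T09:00:00"]
    for mode in SORT_MODES:
        print(mode, sort_value(times, mode))
    print("wanted", earliest_on_day(times, "2018-12-06"))
```

For this listing no mode reproduces the wanted value, which is why the nested-object mapping with a filtered nested sort is used instead.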

Archived: 7 years ago
Views: 280
Last recorded: 7 years ago