Filtering out empty array fields in elasticsearch

Chi*_*ain 13 elasticsearch

My documents have a structure similar to:

{
    title: string,
    description: string,
    privacy_mode: string,
    hidden: boolean,
    added_by: string,
    topics: array
}

I am trying to query elasticsearch, but I don't want any documents whose topics array field is empty.

Below is a function that builds the query object:

function getQueryObject(data) {
    var orList = [{ "term": {"privacy_mode": "public", "hidden": false} }]
    if (data.user) {
        orList.push({ "term": {"added_by": data.user} });
    }

    var queryObj = {
        "fields": ["title", "topics", "added_by", "img_url", "url", "type"],
        "query": {
            "filtered" : {
                "query" : {
                    "multi_match" : {
                        "query" : data.query + '*',
                        "fields" : ["title^4", "topics", "description^3", "tags^2", "body^2", "keywords",
                                "entities", "_id"]
                    }
                },
                "filter" : {
                    "or": orList
                },
                "filter" : {
                    "limit" : {"value" : 15}
                },
                "filter": {
                   "script": {
                        "script": "doc['topics'].values.length > 0"
                   }
               }
            }
        }
    }
    return queryObj;
};

This still returns elements with an empty topics array. I wonder what is wrong!

Thanks for the help.

Ale*_*vik 15

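You probably want the missing filter. Your script approach loads all the values of topics into memory, which is very wasteful unless you are also doing something else with them, such as faceting on them.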

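Also, the structure of your filter is wrong. You cannot repeat the filter key; wrap the individual filters in a bool filter instead. (Here is why you usually want bool rather than and|or|not: http://www.elasticsearch.org/blog/all-about-elasticsearch-filter-bitsets/)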
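A small illustration of why the repeated filter keys fail silently, assuming the query is built as a JavaScript object literal as in the question: when a key appears more than once, only the last value is kept, so the or and limit filters never reach Elasticsearch. The variable name `filtered` is made up for this demo.

```javascript
// Demo only: three "filter" keys in one object literal, as in the question.
var filtered = {
    "filter": { "or": [{ "term": { "privacy_mode": "public" } }] },
    "filter": { "limit": { "value": 15 } },
    "filter": { "script": { "script": "doc['topics'].values.length > 0" } }
};

// Only one "filter" key survives, and it holds the last value assigned.
console.log(Object.keys(filtered));         // [ 'filter' ]
console.log(Object.keys(filtered.filter));  // [ 'script' ]
```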

Lastly, you probably want to specify size on the search object instead of using the limit filter.
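Applied to the function from the question, the three points above might look roughly like the sketch below. It targets the same ES 1.x `filtered` query the question uses. The visibility logic (public and not hidden, or added by the current user) is an assumed reading of the original `orList`, and splitting the two-field `term` into separate `term` filters is part of that assumption.

```javascript
// Sketch only, for the ES 1.x "filtered" query used in the question.
function getQueryObject(data) {
    // Assumed intent: a document is visible if it is public and not hidden,
    // or if it was added by the requesting user.
    var visibility = [{
        "bool": {
            "must": [
                { "term": { "privacy_mode": "public" } },
                { "term": { "hidden": false } }
            ]
        }
    }];
    if (data.user) {
        visibility.push({ "term": { "added_by": data.user } });
    }

    return {
        // size on the search object replaces the limit filter
        "size": 15,
        "fields": ["title", "topics", "added_by", "img_url", "url", "type"],
        "query": {
            "filtered": {
                "query": {
                    "multi_match": {
                        "query": data.query + '*',
                        "fields": ["title^4", "topics", "description^3", "tags^2",
                                   "body^2", "keywords", "entities", "_id"]
                    }
                },
                // a single filter key, with everything wrapped in one bool filter
                "filter": {
                    "bool": {
                        "must": [{ "bool": { "should": visibility } }],
                        // drop documents with no value for topics (missing or empty array)
                        "must_not": [{ "missing": { "field": "topics" } }]
                    }
                }
            }
        }
    };
}
```

Calling `getQueryObject({ query: "kittens", user: "alice" })` returns a plain object that can be sent as the search body.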

I made a runnable example you can play with: https://www.found.no/play/gist/aa59b987269a24feb763

#!/bin/bash

export ELASTICSEARCH_ENDPOINT="http://localhost:9200"

# Index documents
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_bulk?refresh=true" -d '
{"index":{"_index":"play","_type":"type"}}
{"privacy_mode":"public","topics":["foo","bar"]}
{"index":{"_index":"play","_type":"type"}}
{"privacy_mode":"private","topics":[]}
'

# Do searches

curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
{
    "query": {
        "filtered": {
            "filter": {
                "bool": {
                    "must": [
                        {
                            "term": {
                                "privacy_mode": "public"
                            }
                        }
                    ],
                    "must_not": [
                        {
                            "missing": {
                                "field": "topics"
                            }
                        }
                    ]
                }
            }
        }
    }
}
'

  • The missing filter is dead. This is the version for ES 2.4: https://www.elastic.co/guide/en/elasticsearch/reference/2.4/query-dsl-missing-filter.html. That link says you must now use the [Missing Query](https://www.elastic.co/guide/zh-CN/elasticsearch/reference/2.4/query-dsl-missing-query.html), which is itself deprecated since ES 2.2.0, so now you have to use the [Exists Query](https://www.elastic.co/guide/zh-CN/elasticsearch/reference/5.1/query-dsl-exists-query.html) (2 upvotes)

Qy *_*Zuo 8

The missing keyword was removed as of ES 5.0, and the documentation suggests using exists instead (see here):

curl -XGET 'localhost:9200/_search?pretty' -H 'Content-Type: application/json' -d'
{
    "query": {
        "bool": {
            "must_not": {
                   "exists": {
                       "field": "topics"
                   }
            }
        }
    }
}'

  • "must" "exists"? (5 upvotes)
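As the comment hints, `must_not` combined with `exists` selects the documents that lack a value for topics, which is the opposite of what the question asks for. A minimal sketch of the positive form for ES 5.0 and later, where the `filtered` query is also gone, could look like this. The helper name `buildNonEmptyFieldQuery` is hypothetical, and placing `exists` in the `filter` clause rather than `must` is a choice made here to skip scoring.

```javascript
// Hypothetical helper: builds a search body for ES 5.0+ that keeps only
// documents with at least one non-null value in the given field.
// An empty array counts as having no value, so those documents are excluded.
function buildNonEmptyFieldQuery(field) {
    return {
        "query": {
            "bool": {
                // filter context: matches like "must" but does not affect scoring
                "filter": [
                    { "exists": { "field": field } }
                ]
            }
        }
    };
}

// Print the body that would be sent to the _search endpoint.
console.log(JSON.stringify(buildNonEmptyFieldQuery("topics"), null, 4));
```

Other conditions, such as the privacy_mode term from the first answer, can be added as further entries in the same `filter` array.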