Elasticsearch multi_match cross_fields with prefix matching

Bou*_*egh 9 prefix elasticsearch

I have a multi_match query of type cross_fields that I would like to improve with prefix matching.

{
  "index": "companies",
  "size": 25,
  "from": 0,
  "body": {
    "_source": {
      "include": [
        "name",
        "address"
      ]
    },
    "query": {
      "filtered": {
        "query": {
          "multi_match": {
            "type": "cross_fields",
            "query": "Google",
            "operator": "and",
            "fields": [
              "name",
              "address"
            ]
          }
        }
      }
    }
  }
}

It matches whole-word queries such as google mountain view perfectly. The filtered wrapper is there because I need to add a geo filter dynamically.

{
  "id": 1,
  "name": "Google",
  "address": "Mountain View"
} 

Now I would like to allow prefix matching without breaking cross_fields.

These queries should match:

  • goog
  • google mount
  • google mountain vi
  • mountain view goo

If I change multi_match.type to phrase_prefix, the whole query is matched against a single field, so it matches mountain vi but not google mountain vi.

How can I solve this?

小智 6

Since there is no answer yet and someone else may come across this: I had the same problem, and here is a solution.

Use an edgeNGram token filter

You need to change the index settings and the mapping.

Here is an example of the settings:

"settings" : {
  "index" : {
    "analysis" : {
      "analyzer" : {
        "ngram_analyzer" : {
          "type" : "custom",
          "stopwords" : "_none_",
          "filter" : [ "standard", "lowercase", "asciifolding", "word_delimiter", "no_stop", "ngram_filter" ],
          "tokenizer" : "standard"
        },
        "default" : {
          "type" : "custom",
          "stopwords" : "_none_",
          "filter" : [ "standard", "lowercase", "asciifolding", "word_delimiter", "no_stop" ],
          "tokenizer" : "standard"
        }
      },
      "filter" : {
        "no_stop" : {
          "type" : "stop",
          "stopwords" : "_none_"
        },
        "ngram_filter" : {
          "type" : "edgeNGram",
          "min_gram" : "2",
          "max_gram" : "20"
        }
      }
    }
  }
}

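Of course, you should adapt the analyzers to your own use case. You may want to leave the default analyzer untouched, or add the ngram filter to it so that you do not have to change the mapping. The latter option means that every field in the index gets the ngram filter.

For the mapping: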
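To make it clearer what an edge n-gram filter with min_gram 2 and max_gram 20 emits, here is a rough Python sketch. It is not Elasticsearch code: the helper names edge_ngrams and analyze are made up for illustration, and the sketch only lowercases and splits on whitespace, ignoring the asciifolding and word_delimiter filters.

```python
def edge_ngrams(token, min_gram=2, max_gram=20):
    """Return the leading substrings of token with lengths min_gram..max_gram."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def analyze(text):
    """Very rough stand-in for ngram_analyzer: lowercase, split, edge n-grams."""
    grams = []
    for word in text.lower().split():
        grams.extend(edge_ngrams(word))
    return grams


print(analyze("Google"))         # ['go', 'goo', 'goog', 'googl', 'google']
print(analyze("Mountain View"))  # prefixes of "mountain", then of "view"
```

Every prefix of at least two characters ends up in the index as its own term, which is why a partial word such as goog can later be found with an ordinary term match.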

"mappings" : {
  "patient" : {
    "properties" : {
      "name" : {
        "type" : "string",
        "analyzer" : "ngram_analyzer"
      },
      "address" : {
        "type" : "string",
        "analyzer" : "ngram_analyzer"
      }
    }
  }
}

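Declare every field you want autocomplete on with ngram_analyzer. The query from your question should then work. If you ended up using something else, I would be glad to hear about it.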
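As a sanity check, the following Python sketch models why the four example queries from the question would match. It is a simplified model under stated assumptions, not how Elasticsearch is implemented: the same edge n-gram analysis is assumed at index and search time (the mapping above sets no separate search analyzer), cross_fields with operator and is approximated as "every query term must occur in at least one of the fields", and scoring is ignored. All function names are hypothetical.

```python
def edge_ngrams(token, min_gram=2, max_gram=20):
    """Leading substrings of token with lengths min_gram..max_gram, as a set."""
    upper = min(len(token), max_gram)
    return {token[:n] for n in range(min_gram, upper + 1)}


def analyze(text):
    """Rough stand-in for ngram_analyzer: lowercase, split, edge n-grams."""
    terms = set()
    for word in text.lower().split():
        terms |= edge_ngrams(word)
    return terms


def cross_fields_and(query, doc, fields):
    """Approximate cross_fields + operator 'and': each query term must be
    present in the combined terms of the listed fields."""
    indexed = set()
    for field in fields:
        indexed |= analyze(doc[field])
    query_terms = analyze(query)
    # A query that yields no terms (e.g. a single character) matches nothing.
    return bool(query_terms) and query_terms <= indexed


doc = {"id": 1, "name": "Google", "address": "Mountain View"}
for q in ["goog", "google mount", "google mountain vi", "mountain view goo", "googol"]:
    print(q, "->", cross_fields_and(q, doc, ["name", "address"]))
```

In this model the four queries from the question match and googol does not, because its gram googo was never indexed. Since the query text is also split into grams here, a common refinement is to analyze queries with a plain analyzer instead; whether that is needed depends on your use case.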