Multi-field text and keyword fields in Elasticsearch

Sni*_*olf 3 elasticsearch elasticsearch-5

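I am looking at switching from Solr to Elasticsearch and have indexed a bunch of documents without providing a schema/mapping. Many of the fields that I had previously set up as indexed strings in Solr have been mapped as both text and keyword fields, using the multi-fields feature.

Is there any benefit to having a keyword field also be a text field via multi-fields? In my case most of the values in these fields are single words, so I would expect it not to matter much whether they go through an analyzer. However, the ES documentation seems to imply that keyword fields are not considered when searching, or are at least treated differently.

To expand a little: if I search for the term "ipad", will a document score higher if it contains "ipad" in a keyword field as well as in some other text field, compared with the same document without the keyword field? And will the document still match if "ipad" appears only in the keyword field?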

Sni*_*olf 5

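To answer my own question, I put together a quick test. When searching with a plain q=ipad, keyword and text fields behaved almost identically, and a multi-field got the same score as a field of its main type, so I assume the second (sub-)field has no effect on search scoring.

Oddly, multi-word values in the keyword field and in the text field got the same score. I would have expected the keyword field to score lower or not match at all, but this is fine for my purposes, so I am not going to investigate further.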

Index creation

PUT test_index
{
    "settings" : {
        "number_of_shards" : 1
    },
    "mappings" : {
        "test_type" : {
            "properties" : {
                "multifield": {
                  "type": "text",
                  "fields": {
                     "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                     }
                  }
                },

                "keywordfield": {
                  "type": "keyword"
                },

                "textfield": {
                  "type": "text"
                }

            }
        }
    }
}
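The test below only exercises full-text search, so it does not show what the keyword sub-field is typically used for: exact-value matching and aggregations, which text fields do not support by default in Elasticsearch 5.x because fielddata is disabled on them. As a minimal sketch, the following Python builds request bodies that target `multifield.keyword` from the mapping above. The helper names `exact_match_query` and `value_counts_agg` are hypothetical and only construct JSON; the bodies would be sent to `GET /test_index/_search`.

```python
import json

def exact_match_query(field, value, subfield="keyword"):
    """Body for a term query against the un-analyzed sub-field (hypothetical helper)."""
    return {"query": {"term": {f"{field}.{subfield}": value}}}

def value_counts_agg(field, subfield="keyword", size=10):
    """Body for a terms aggregation over the exact values of the sub-field (hypothetical helper)."""
    return {
        "size": 0,  # return no hits, only the aggregation buckets
        "aggs": {"values": {"terms": {"field": f"{field}.{subfield}", "size": size}}},
    }

query_body = exact_match_query("multifield", "ipad")
agg_body = value_counts_agg("multifield")

print(json.dumps(query_body, indent=2))
print(json.dumps(agg_body, indent=2))
```

The point of the sketch is only the field path: searches that should be analyzed go to `multifield`, while exact matching and bucketing go to `multifield.keyword`.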

Data insertion

POST /_bulk
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 1 } }
{ "doc" : { "multifield" : "ipad"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 2 } }
{ "doc" : { "keywordfield" : "ipad"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 3 } }
{ "doc" : { "keywordfield" : "a green ipad"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 4 } }
{ "doc" : { "textfield" : "a yellow ipad"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 5 } }
{ "doc" : { "keywordfield" : "ipad", "textfield" : "ipad"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 6 } }
{ "doc" : { "keywordfield" : "unrelated", "textfield" : "hopefully this wont show up"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 7 } }
{ "doc" : { "textfield" : "ipad"  }, "doc_as_upsert" : true }
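The bulk body is newline-delimited JSON: one action line followed by one payload line per document, with a trailing newline at the end. Writing it by hand makes it easy to drop a brace, so here is a small sketch that generates the same kind of body from a dictionary. The function name `bulk_upsert_body` is hypothetical, and the sketch only produces the text to send to `POST /_bulk`; it does not talk to a cluster.

```python
import json

def bulk_upsert_body(index, doc_type, docs):
    """Build an NDJSON body of update/upsert pairs (hypothetical helper)."""
    lines = []
    for doc_id, doc in docs.items():
        # Action line: which document to update.
        action = {"update": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        # Payload line: partial document, inserted if it does not exist yet.
        lines.append(json.dumps({"doc": doc, "doc_as_upsert": True}))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"

docs = {
    1: {"multifield": "ipad"},
    3: {"keywordfield": "a green ipad"},
}
body = bulk_upsert_body("test_index", "test_type", docs)
print(body, end="")
```

Because each line goes through `json.dumps`, every action and payload line is guaranteed to be a complete JSON object.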

Results

GET /test_index/_search?q=ipad
{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 6,
      "max_score": 0.28122374,
      "hits": [
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "5",
            "_score": 0.28122374,
            "_source": {
               "keywordfield": "ipad",
               "textfield": "ipad"
            }
         },
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "1",
            "_score": 0.2734406,
            "_source": {
               "multifield": "ipad"
            }
         },
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "2",
            "_score": 0.2734406,
            "_source": {
               "keywordfield": "ipad"
            }
         },
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "7",
            "_score": 0.2734406,
            "_source": {
               "textfield": "ipad"
            }
         },
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "3",
            "_score": 0.16417998,
            "_source": {
               "keywordfield": "a green ipad"
            }
         },
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "4",
            "_score": 0.16417998,
            "_source": {
               "textfield": "a yellow ipad"
            }
         }
      ]
   }
}
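A likely explanation for document 3 matching is that a bare `q=ipad` in Elasticsearch 5.x is not run against `keywordfield` at all. With no field given, the query string defaults to the `_all` field, which concatenates the document's values and analyzes them as text. The following is a toy model of that behaviour in Python, not Elasticsearch's actual implementation: the `analyze` function is a rough stand-in for the standard analyzer, and the names `indexed_terms` and `search` are made up for illustration.

```python
import re

# Field types from the test mapping above.
MAPPING = {"multifield": "text", "keywordfield": "keyword", "textfield": "text"}

# The seven test documents.
DOCS = {
    1: {"multifield": "ipad"},
    2: {"keywordfield": "ipad"},
    3: {"keywordfield": "a green ipad"},
    4: {"textfield": "a yellow ipad"},
    5: {"keywordfield": "ipad", "textfield": "ipad"},
    6: {"keywordfield": "unrelated", "textfield": "hopefully this wont show up"},
    7: {"textfield": "ipad"},
}

def analyze(value, field_type):
    """keyword: the whole value is one term; text: lowercase and split into words."""
    if field_type == "keyword":
        return [value]
    return [t for t in re.split(r"\W+", value.lower()) if t]

def indexed_terms(doc):
    """Terms per field, plus a simulated _all field analyzed as text."""
    terms = {f: analyze(v, MAPPING[f]) for f, v in doc.items()}
    terms["_all"] = analyze(" ".join(doc.values()), "text")
    return terms

def search(term, field="_all"):
    """Ids of documents whose given field contains the exact term."""
    return sorted(i for i, d in DOCS.items() if term in indexed_terms(d).get(field, []))

print(search("ipad"))                        # [1, 2, 3, 4, 5, 7]
print(search("ipad", field="keywordfield"))  # [2, 5]
print(search("ipad", field="textfield"))     # [4, 5, 7]
```

In this model the unscoped search returns the same six documents as the result above, while a search scoped to `keywordfield` no longer finds "a green ipad". On a real 5.x cluster, a field-scoped request such as `GET /test_index/_search?q=keywordfield:ipad` would be expected to show the same difference.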