Reindexing an Elasticsearch index with a parent-child relationship

Ben*_*Ben 4 parent-child elasticsearch reindex

We currently have a "message" that can contain a link to a "parent" message. For example, a reply has the original message as its parent_id.

PUT {
  "mappings": {
    "message": {
      "properties": {
        "subject": {
          "type": "text"
        },
        "body": {
          "type": "text"
        },
        "parent_id": {
          "type": "long"
        }
      }
    }
  }
}

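Currently we have no Elasticsearch parent-child join on these documents, because parent and child were not allowed to be of the same type. Now that Elastic is moving to get rid of types, we are trying to use the new parent-child join in 5.6.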

PUT {
  "settings": {
    "mapping.single_type": true
  },
  "mappings": {
    "message": {
      "properties": {
        "subject": {
          "type": "text"
        },
        "body": {
          "type": "text"
        },
        "join_field": {
          "type": "join",
          "relations": {
            "parent_message": "child_message"
          }
        }
      }
    }
  }
}

I know I have to create a new index for this and then reindex everything with _reindex, but I'm not quite sure how to do that.

Indexing a parent_message and a reply to it by hand is straightforward:

PUT localhost:9200/testm1/message/1
{
  "subject": "Message 1",
  "body": "body 1"
}
PUT localhost:9200/testm1/message/3?routing=1
{
  "subject": "Message Reply to 1",
  "body": "body 3",
  "join_field": {
    "name": "child_message",
    "parent": "1"
  }
}

A search now returns:

{
                "_index": "testm1",
                "_type": "message",
                "_id": "2",
                "_score": 1,
                "_source": {
                    "subject": "Message 2",
                    "body": "body 2"
                }
            },
            {
                "_index": "testm1",
                "_type": "message",
                "_id": "1",
                "_score": 1,
                "_source": {
                    "subject": "Message 1",
                    "body": "body 1"
                }
            },
            {
                "_index": "testm1",
                "_type": "message",
                "_id": "3",
                "_score": 1,
                "_routing": "1",
                "_source": {
                    "subject": "Message Reply to 1",
                    "body": "body 3",
                    "join_field": {
                        "name": "child_message",
                        "parent": "1"
                    }
                }
            }

I tried creating the new index (testmnew) and then running a _reindex:

POST _reindex
{
  "source": {
    "index": "testm"
  },
  "dest": {
    "index": "testmnew"
  },
  "script": {
    "inline": """
      ctx._routing = ctx._source.parent_id;
      // Missing: I guess join_field needs to be set here as well
    """
  }
}

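The script part is still not quite clear to me. Am I on the right track? Can I simply set _routing on every message (it would be null on parent messages)? And how do I set join_field only for the child messages?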

Ben*_*Ben 6

Here is the reindex script I ended up using:

curl -XPOST 'localhost:9200/_reindex' -H 'Content-Type: application/json' -d'
{
    "source": {
        "index" : "testm"
    },
    "dest" :{
        "index" : "testmnew"
    },
    "script" : {
        "lang" : "painless",
        "source" : "if(ctx._source.parent_id != null){ctx._routing = ctx._source.parent_id; ctx._source.join_field=  params.cjoin; ctx._source.join_field.parent = ctx._source.parent_id;}else{ctx._source.join_field = params.parent_join}",
        "params" : {
            "cjoin" :{
                "name": "child_message",
                "parent": 1
            },
            "parent_join" : {"name": "parent_message"}

        }
    }
}
'
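
For anyone who finds the one-line Painless script hard to read, here is the same per-document logic as a small Python sketch. It is only an illustration and does not talk to Elasticsearch. The function name `to_join_document` is made up for this example, and returning the routing value as a string is my own assumption (the script above assigns `parent_id` directly).

```python
from typing import Any, Dict, Optional, Tuple


def to_join_document(source: Dict[str, Any]) -> Tuple[Dict[str, Any], Optional[str]]:
    """Return (new_source, routing) for one message, mirroring the reindex script.

    Hypothetical helper for illustration only; not part of any Elasticsearch client.
    """
    doc = dict(source)  # shallow copy so the caller's dict is left untouched
    parent_id = doc.get("parent_id")
    if parent_id is not None:
        # Child message: point join_field at the parent and route to the
        # parent's shard. Converting the routing to str is an assumption.
        doc["join_field"] = {"name": "child_message", "parent": parent_id}
        return doc, str(parent_id)
    # No parent_id: treat the message as a parent, which needs no routing.
    doc["join_field"] = {"name": "parent_message"}
    return doc, None


if __name__ == "__main__":
    print(to_join_document({"subject": "Message 1", "body": "body 1"}))
    print(to_join_document(
        {"subject": "Message Reply to 1", "body": "body 3", "parent_id": 1}
    ))
```

Like the script, this leaves `parent_id` in the document. Unlike the script, it builds a fresh `join_field` dict for every child instead of reusing the shared `cjoin` parameter.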