How to do source filtering on nested fields

R.D*_*R.D 4 elasticsearch

Sample document

{
  "id" : "video1",
  "title" : "Gone with the wind",
  "timedTextLines" : [
    {
      "startTime" : "00:00:02",
      "endTime" : "00:00:05",
      "textLine" : "Frankly my dear I don't give a damn."
    },
    {
      "startTime" : "00:00:07",
      "endTime" : "00:00:09",
      "textLine" : " my amazing country."
    },
    {
      "startTime" : "00:00:17",
      "endTime" : "00:00:29",
      "textLine" : " amazing country."
    }
  ]
}

Index definition

{
  "mappings": {
    "video_type": {
      "properties": {
        "timedTextLines": {
          "type": "nested" 
        }
      }
    }
  }
}

Without source filtering in inner_hits, the query works fine and returns this response:

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.91737854,
    "hits": [
      {
        "_index": "video_index",
        "_type": "video_type",
        "_id": "1",
        "_score": 0.91737854,
        "_source": {

        },
        "inner_hits": {
          "timedTextLines": {
            "hits": {
              "total": 1,
              "max_score": 0.6296964,
              "hits": [
                {
                  "_nested": {
                    "field": "timedTextLines",
                    "offset": 0
                  },
                  "_score": 0.6296964,
                  "_source": {
                    "startTime": "00:00:02",
                    "endTime": "00:00:05",
                    "textLine": "Frankly my dear I don't give a damn."
                  },
                  "highlight": {
                    "timedTextLines.textLine": [
                      "Frankly my dear I don't give a <em>damn</em>."
                    ]
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}

The response contains every property of the nested object, i.e. startTime, endTime and textLine. How can I return only endTime and startTime in the response?

Failed query

{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "title": "gone"
          }
        },
        {
          "nested": {
            "path": "timedTextLines",
            "query": {
              "match": {
                "timedTextLines.textLine": "damn"
              }
            },
            "inner_hits": {
              "_source": ["startTime", "endTime"],
              "highlight": {
                "fields": {
                  "timedTextLines.textLine": {

                  }
                }
              }
            }
          }
        }
      ]
    }
  },
  "_source":"false"
}

Error

HTTP/1.1 400 Bad Request
content-type: application/json; charset=UTF-8
content-length: 265

{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"[inner_hits] _source doesn't support values of type: START_ARRAY"}],"type":"illegal_argument_exception","reason":"[inner_hits] _source doesn't support values of type: START_ARRAY"},"status":400}

Val*_*Val 5

The reason is that since ES 5.0, the short form of _source is no longer supported in inner_hits; only the full object form (with includes and excludes) is accepted (see this open issue).

Your query can be rewritten like this, and it will work:

{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "title": "gone"
          }
        },
        {
          "nested": {
            "path": "timedTextLines",
            "query": {
              "match": {
                "timedTextLines.textLine": "damn"
              }
            },
            "inner_hits": {
              "_source": {
                "includes": [
                  "timedTextLines.startTime",
                  "timedTextLines.endTime"
                ]
              },
              "highlight": {
                "fields": {
                  "timedTextLines.textLine": {

                  }
                }
              }
            }
          }
        }
      ]
    }
  },
  "_source":"false"
}
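If you build such queries in client code, the same rewrite can be applied programmatically before the request is sent. The following is only a minimal sketch in Python, not part of any Elasticsearch client: the helper names `expand_inner_hits_source` and `qualify` are hypothetical, and prefixing each field with the nested path simply mirrors the form used in the query above.

```python
import copy
import json


def qualify(field, path):
    """Prefix a field name with the nested path unless it already has it."""
    if path and not field.startswith(path + "."):
        return f"{path}.{field}"
    return field


def expand_inner_hits_source(node, path=None):
    """Return a new query body in which every list-valued
    inner_hits._source is rewritten to the {"includes": [...]} form."""
    if isinstance(node, list):
        return [expand_inner_hits_source(item, path) for item in node]
    if not isinstance(node, dict):
        return node

    result = {}
    for key, value in node.items():
        if key == "nested" and isinstance(value, dict):
            # Remember the nested path so field names can be qualified with it.
            result[key] = expand_inner_hits_source(value, value.get("path"))
        elif key == "inner_hits" and isinstance(value, dict):
            inner = copy.deepcopy(value)
            source = inner.get("_source")
            if isinstance(source, list):
                inner["_source"] = {
                    "includes": [qualify(f, path) for f in source]
                }
            result[key] = inner
        else:
            result[key] = expand_inner_hits_source(value, path)
    return result


# The nested clause from the failed query, reduced to the relevant parts.
failing = {
    "nested": {
        "path": "timedTextLines",
        "query": {"match": {"timedTextLines.textLine": "damn"}},
        "inner_hits": {"_source": ["startTime", "endTime"]},
    }
}

fixed = expand_inner_hits_source(failing)
print(json.dumps(fixed["nested"]["inner_hits"], indent=2))
```

The input dictionary is left unchanged, so the original and the rewritten body can be compared side by side while debugging.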