In a movie database, I store the ratings (0 to 5 stars) that users give to each movie. I have indexed documents with the following structure in Elasticsearch (version 1.2.2):
{
  "_index": "my_index",
  "_type": "film",
  "_id": "6629",
  "_source": {
    "id": "6629",
    "title": "Fight Club",
    "ratings": [
      { "user_id": 1234, "rating_value": 3 },
      { "user_id": 4567, "rating_value": 2 },
      { "user_id": 7890, "rating_value": 1 }
      .....
    ]
  }
}
{
  "_index": "my_index",
  "_type": "film",
  "_id": "6630",
  "_source": {
    "id": "6630",
    "title": "Pulp Fiction",
    "ratings": [
      { "user_id": 1234, "rating_value": 1 },
      { "user_id": 7654, "rating_value": 2 },
      { "user_id": 4321, "rating_value": 5 }
      .....
    ]
  }
}
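and so on.
My goal is to retrieve, in a single search, all the movies rated by a given user (say user 1234), along with the corresponding rating_value.
If I run the following search: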
GET my_index/film/_search
{
  "query": {
    "match": {
      "ratings.user_id": "1234"
    }
  }
}
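I get the whole document for every matching movie, and then I have to parse the entire ratings array to find which element matched my query and which rating_value is associated with user_id 1234.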
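To make that concrete, the client-side parsing I am doing today looks roughly like the sketch below. The function name ratings_for_user and the hard-coded sample_response are only for illustration; the response shape mirrors the documents shown above.

```python
def ratings_for_user(search_response, user_id):
    """Scan every hit's ratings array and keep only the given user's rating."""
    results = []
    for hit in search_response["hits"]["hits"]:
        source = hit["_source"]
        for rating in source.get("ratings", []):
            if rating["user_id"] == user_id:
                results.append((source["title"], rating["rating_value"]))
                break  # assumption: a user rates a given movie at most once
    return results


# Hypothetical, trimmed search response built from the two documents above.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "6629", "_source": {"title": "Fight Club", "ratings": [
                {"user_id": 1234, "rating_value": 3},
                {"user_id": 4567, "rating_value": 2},
            ]}},
            {"_id": "6630", "_source": {"title": "Pulp Fiction", "ratings": [
                {"user_id": 1234, "rating_value": 1},
                {"user_id": 7654, "rating_value": 2},
            ]}},
        ]
    }
}

user_1234_ratings = ratings_for_user(sample_response, 1234)
print(user_1234_ratings)
```

This works, but every full ratings array is sent over the wire and scanned on the client, which is what I would like to avoid.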
Ideally, I would like the result of this query to be:
"hits": [
  {
    "_index": "my_index",
    "_type": "film",
    "_id": "6629",
    "_source": {
      "id": "6629",
      "title": "Fight Club",
      "ratings": [
        { "user_id": 1234, "rating_value": 3 }  // <= only the row that matches the query
      ]
    }
  },
  {
    "_index": "my_index",
    "_type": "film",
    "_id": "6630",
    "_source": {
      "id": "6630",
      "title": "Pulp Fiction",
      "ratings": [
        { "user_id": 1234, "rating_value": 1 }  // <= only the row that matches the query
      ]
    }
  }
]
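Thanks in advance.
As mentioned in my earlier comment, I managed to retrieve the values using aggregations.
Here is how I did it.
First, the mapping I used: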
PUT test/movie/_mapping
{
  "properties": {
    "title": {
      "type": "string",
      "index": "not_analyzed"
    },
    "ratings": {
      "type": "nested"
    }
  }
}
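I chose not to analyze the title (it is mapped as not_analyzed, so the terms aggregation buckets on the whole title), but you could instead use the fields attribute and keep the unanalyzed value as a "raw" sub-field.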
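For illustration, a multi-field version of the title mapping might look like the sketch below, written as a Python dict that could be serialized as the body of the mapping request. The sub-field name raw is only a convention I am assuming here; with this mapping the terms aggregation would point at title.raw rather than title.

```python
import json

# Sketch of a multi-field mapping (Elasticsearch 1.x syntax): "title" stays
# analyzed for full-text search, while "title.raw" keeps the exact string.
movie_mapping = {
    "properties": {
        "title": {
            "type": "string",
            "fields": {
                "raw": {"type": "string", "index": "not_analyzed"}
            },
        },
        "ratings": {"type": "nested"},
    }
}

# Serialize to the JSON body that would be sent with the PUT mapping request.
mapping_body = json.dumps(movie_mapping, indent=2)
print(mapping_body)
```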
Then, index the movies:
PUT test/movie/6629
{
  "title": "Fight Club",
  "ratings": [
    { "user_id": 1234, "rating_value": 3 },
    { "user_id": 4567, "rating_value": 2 },
    { "user_id": 7890, "rating_value": 1 }
  ]
}
PUT test/movie/4456
{
  "title": "Jumanji",
  "ratings": [
    { "user_id": 1234, "rating_value": 4 },
    { "user_id": 4567, "rating_value": 3 },
    { "user_id": 4630, "rating_value": 5 }
  ]
}
PUT test/movie/6547
{
  "title": "Hook",
  "ratings": [
    { "user_id": 1234, "rating_value": 4 },
    { "user_id": 7890, "rating_value": 1 }
  ]
}
The aggregation query is:
GET test/movie/_search
{
  "aggs": {
    "by_movie": {
      "terms": {
        "field": "title"
      },
      "aggs": {
        "ratings_by_user": {
          "nested": {
            "path": "ratings"
          },
          "aggs": {
            "for_user_1234": {
              "filter": {
                "term": {
                  "ratings.user_id": "1234"
                }
              },
              "aggs": {
                "rating_value": {
                  "terms": {
                    "field": "ratings.rating_value"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
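If you need this for arbitrary users, the same body can be generated by a small helper. This is only a sketch: the function name build_user_ratings_query is my own, and it just produces the JSON body without sending it anywhere. Note that a terms aggregation returns only the top 10 buckets by default, so with a larger catalogue you would probably need to set size on by_movie.

```python
import json


def build_user_ratings_query(user_id):
    """Return the aggregation body above, parameterized by user id."""
    filter_name = "for_user_%s" % user_id  # e.g. "for_user_1234"
    return {
        "aggs": {
            "by_movie": {
                # One bucket per movie title (title must be not_analyzed).
                "terms": {"field": "title"},
                "aggs": {
                    "ratings_by_user": {
                        # Step into the nested ratings objects.
                        "nested": {"path": "ratings"},
                        "aggs": {
                            filter_name: {
                                # Keep only this user's rating objects.
                                "filter": {"term": {"ratings.user_id": user_id}},
                                "aggs": {
                                    "rating_value": {
                                        "terms": {"field": "ratings.rating_value"}
                                    }
                                },
                            }
                        },
                    }
                },
            }
        }
    }


query_body = build_user_ratings_query(1234)
print(json.dumps(query_body, indent=2))
```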
Finally, here is the output produced when this query is run against the documents above:
"aggregations": {
  "by_movie": {
    "buckets": [
      {
        "key": "Fight Club",
        "doc_count": 1,
        "ratings_by_user": {
          "doc_count": 3,
          "for_user_1234": {
            "doc_count": 1,
            "rating_value": {
              "buckets": [
                {
                  "key": 3,
                  "key_as_string": "3",
                  "doc_count": 1
                }
              ]
            }
          }
        }
      },
      {
        "key": "Hook",
        "doc_count": 1,
        "ratings_by_user": {
          "doc_count": 2,
          "for_user_1234": {
            "doc_count": 1,
            "rating_value": {
              "buckets": [
                {
                  "key": 4,
                  "key_as_string": "4",
                  "doc_count": 1
                }
              ]
            }
          }
        }
      },
      {
        "key": "Jumanji",
        "doc_count": 1,
        "ratings_by_user": {
          "doc_count": 3,
          "for_user_1234": {
            "doc_count": 1,
            "rating_value": {
              "buckets": [
                {
                  "key": 4,
                  "key_as_string": "4",
                  "doc_count": 1
                }
              ]
            }
          }
        }
      }
    ]
  }
}
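It's a bit tedious because of the nested syntax, but you will be able to retrieve the rating given by the specified user (here, 1234) for each movie.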
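On the client side, flattening that response into a title-to-rating mapping takes only a few lines. The sketch below assumes the aggregation names used above (by_movie, ratings_by_user, for_user_1234, rating_value); the helper name ratings_from_aggs and the trimmed sample_aggs response are made up for the example.

```python
def ratings_from_aggs(response, filter_name="for_user_1234"):
    """Map each movie title to the rating found in the filtered nested bucket."""
    ratings = {}
    for movie in response["aggregations"]["by_movie"]["buckets"]:
        value_buckets = movie["ratings_by_user"][filter_name]["rating_value"]["buckets"]
        # Movies the user has not rated still get a title bucket, but with
        # no rating_value buckets inside it, so they are skipped here.
        if value_buckets:
            ratings[movie["key"]] = value_buckets[0]["key"]
    return ratings


def _movie_bucket(title, rating):
    """Build one trimmed by_movie bucket in the shape of the output above."""
    return {
        "key": title,
        "ratings_by_user": {
            "for_user_1234": {
                "rating_value": {"buckets": [{"key": rating, "doc_count": 1}]}
            }
        },
    }


sample_aggs = {
    "aggregations": {
        "by_movie": {
            "buckets": [
                _movie_bucket("Fight Club", 3),
                _movie_bucket("Hook", 4),
                _movie_bucket("Jumanji", 4),
            ]
        }
    }
}

user_ratings = ratings_from_aggs(sample_aggs)
print(user_ratings)
```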
Hope this helps!