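I have three tables like this:
movie: id, name
tag: id, name, value
tagged: id, movie (FK), tag (FK)
So each movie has its own set of tags. What I need is to retrieve similar movies based on that tag set. I want to pull out, say, 10 movies ordered by the number of matching tags.
If I create a view like the one below, it makes MySQL go away. The "tag" and "tagged" tables each hold 30k+ records.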
create view relatedtags as
select
entityLeft.id as id,
entityRight.id as rightId,
count(rightTagged.id) as matches
from
entity as entityLeft join tagged as leftTagged on leftTagged.entity = entityLeft.id,
entity as entityRight join tagged as rightTagged on rightTagged.entity = entityRight.id
where leftTagged.tag = rightTagged.tag
and entityLeft.id != entityRight.id
group by entityLeft.id, entityRight.id
This returns a list of all movies that share at least one tag with <current_movie_id>,
ordered by decreasing number of tags in common:
SELECT movie.*, count(DISTINCT similar.tag) as shared_tags FROM movie INNER JOIN
( tagged AS this_movie INNER JOIN tagged AS similar USING (tag) )
ON similar.movie = movie.id
WHERE this_movie.movie=<current_movie_id>
AND movie.id != this_movie.movie
GROUP BY movie.id
ORDER BY shared_tags DESC
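To try the idea against a small data set, here is a minimal sketch in Python. It makes several assumptions that are not in the original post: it uses an in-memory SQLite database as a stand-in for MySQL, the sample rows are made up, the helper name `similar_movies` is hypothetical, the alias `similar` is renamed to `other`, and a `LIMIT` is added to get the "top 10" the question asks for. The join logic is otherwise the same as in the query above.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; all rows are made-up samples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tag    (id INTEGER PRIMARY KEY, name TEXT, value TEXT);
    CREATE TABLE tagged (id INTEGER PRIMARY KEY,
                         movie INTEGER REFERENCES movie(id),
                         tag   INTEGER REFERENCES tag(id));

    INSERT INTO movie (id, name) VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
    INSERT INTO tag (id, name, value) VALUES
        (1, 'genre', 'sci-fi'), (2, 'genre', 'drama'),
        (3, 'era', '1980s'),    (4, 'era', '1990s');
    INSERT INTO tagged (movie, tag) VALUES
        (1, 1), (1, 2), (1, 3),  -- movie 1 carries tags 1, 2 and 3
        (2, 1), (2, 2),          -- movie 2 shares two tags with movie 1
        (3, 3),                  -- movie 3 shares one tag with movie 1
        (4, 4);                  -- movie 4 shares nothing with movie 1
""")


def similar_movies(conn, movie_id, limit=10):
    """Return (id, name, shared_tags) rows for movies sharing tags with movie_id.

    Hypothetical helper: starts from the current movie's tags and self-joins
    `tagged` on the tag column, so only rows relevant to one movie are touched.
    """
    sql = """
        SELECT movie.id, movie.name, COUNT(DISTINCT other.tag) AS shared_tags
        FROM tagged AS this_movie
        JOIN tagged AS other ON other.tag = this_movie.tag
        JOIN movie ON movie.id = other.movie
        WHERE this_movie.movie = ?
          AND other.movie != this_movie.movie
        GROUP BY movie.id, movie.name
        ORDER BY shared_tags DESC, movie.id  -- movie.id breaks ties deterministically
        LIMIT ?
    """
    return conn.execute(sql, (movie_id, limit)).fetchall()


if __name__ == "__main__":
    for row in similar_movies(conn, 1):
        print(row)
```

Unlike the view in the question, which pairs every movie with every other movie before grouping, this filters on a single movie first, so the self-join only fans out from that movie's own tags.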
Hope that gives you something to work with.