MySQL full-text search relevance across multiple tables

mic*_*ael 13 mysql search full-text-search relevance

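I've been tasked with building a site-wide search feature. The search needs to cover articles, events, and page content.

I've used MATCH() ... AGAINST() in MySQL before and know how to get a relevance score for each result, but as far as I know that score is specific to the search it came from (it depends on the content, the number of rows, and so on). The relevance of results from the articles table won't line up with the relevance of results from the events table.

Is there any way to unify the relevance so that results from all three tables are comparable?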

joe*_*son 22

Yes, you can unify them by using a search engine such as Apache Lucene and Solr.

http://lucene.apache.org/solr/

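If you need to do this in MySQL alone, you can do it with a UNION. You will probably want to suppress any results with zero relevance.

You'll need to decide how the relevance should be affected by which table the match came from.

For example, suppose you want articles to be most important, events to be of medium importance, and pages to be least important. You can use multipliers like this: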

set @articles_multiplier=3;
set @events_multiplier=2;
set @pages_multiplier=1;

Here's a working example you can try that demonstrates some of these techniques:

Create sample data:

create database d;
use d;

create table articles (id int primary key, content text) ENGINE = MYISAM;
create table events (id int primary key, content text) ENGINE = MYISAM;
create table pages (id int primary key, content text) ENGINE = MYISAM;

insert into articles values 
(1, "Lorem ipsum dolor sit amet"),
(2, "consectetur adipisicing elit"),
(3, "sed do eiusmod tempor incididunt");

insert into events values 
(1, "Ut enim ad minim veniam"),
(2, "quis nostrud exercitation ullamco"),
(3, "laboris nisi ut aliquip");

insert into pages values 
(1, "Duis aute irure dolor in reprehenderit"),
(2, "in voluptate velit esse cillum"),
(3, "dolore eu fugiat nulla pariatur.");

Make it searchable:

ALTER TABLE articles ADD FULLTEXT(content);
ALTER TABLE events ADD FULLTEXT(content);
ALTER TABLE pages ADD FULLTEXT(content);
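Before searching, it can be worth confirming that the indexes were actually created. A minimal check, assuming the three sample tables above, is to list each table's indexes and filter on the index type:

```sql
-- List only the FULLTEXT indexes on each sample table.
-- Each statement should return one row for the `content` column.
SHOW INDEX FROM articles WHERE Index_type = 'FULLTEXT';
SHOW INDEX FROM events   WHERE Index_type = 'FULLTEXT';
SHOW INDEX FROM pages    WHERE Index_type = 'FULLTEXT';
```

If a statement returns no rows, MATCH() ... AGAINST() on that table will fail with an error about a missing FULLTEXT index.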

Use a UNION to search all of these tables:

set @target='dolor';

SELECT * FROM (
  SELECT
    'articles' AS table_name,
    id,
    @articles_multiplier * (MATCH(content) AGAINST (@target)) AS relevance
  FROM articles
  -- UNION ALL: each branch tags its rows with a different table_name,
  -- so there are no duplicate rows to remove
  UNION ALL
  SELECT
    'events' AS table_name,
    id,
    @events_multiplier * (MATCH(content) AGAINST (@target)) AS relevance
  FROM events
  UNION ALL
  SELECT
    'pages' AS table_name,
    id,
    @pages_multiplier * (MATCH(content) AGAINST (@target)) AS relevance
  FROM pages
) AS sitewide
WHERE relevance > 0        -- suppress rows that did not match
ORDER BY relevance DESC;   -- best matches first

Results:

+------------+----+------------------+
| table_name | id | relevance        |
+------------+----+------------------+
| articles   |  1 | 1.98799377679825 |
| pages      |  3 | 0.65545331108093 |
+------------+----+------------------+
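The multipliers weight the tables against each other, but the raw scores still depend on each table's own contents. One possible refinement, which is an assumption on my part rather than anything built into MySQL, is to divide each score by the best score in its own table for the current search, so every table's top hit starts at 1.0 before the multiplier is applied. A rough sketch, reusing @target and the multiplier variables from above (the alias names `m` and `best` are purely illustrative):

```sql
-- Sketch only: per-table normalization before weighting.
-- Assumes @target and the three @..._multiplier variables are already set.
SELECT 'articles' AS table_name, a.id,
       @articles_multiplier * (MATCH(a.content) AGAINST (@target)) / m.best AS relevance
FROM articles AS a
CROSS JOIN (SELECT MAX(MATCH(content) AGAINST (@target)) AS best
            FROM articles) AS m
WHERE MATCH(a.content) AGAINST (@target)   -- keep matching rows only, so best > 0
UNION ALL
SELECT 'events' AS table_name, e.id,
       @events_multiplier * (MATCH(e.content) AGAINST (@target)) / m.best AS relevance
FROM events AS e
CROSS JOIN (SELECT MAX(MATCH(content) AGAINST (@target)) AS best
            FROM events) AS m
WHERE MATCH(e.content) AGAINST (@target)
UNION ALL
SELECT 'pages' AS table_name, p.id,
       @pages_multiplier * (MATCH(p.content) AGAINST (@target)) / m.best AS relevance
FROM pages AS p
CROSS JOIN (SELECT MAX(MATCH(content) AGAINST (@target)) AS best
            FROM pages) AS m
WHERE MATCH(p.content) AGAINST (@target)
ORDER BY relevance DESC;
```

With this scheme the best match in each table scores exactly that table's multiplier, and weaker matches fall below it in proportion. Whether that is the right notion of "comparable" depends on your content: a table whose best match is poor still gets a top score, so treat this as a starting point rather than a definitive ranking.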