Tom*_*len 9 sql search-engine sql-order-by
I have my search terms:
"Yellow large widgets"
I split these terms into 3 words:
1 = "Yellow";
2 = "Large";
3 = "Widgets";
Then I search with:
SELECT * FROM widgets
WHERE (description LIKE '%yellow%' OR description LIKE '%large%' OR description LIKE '%widgets%')
OR (title LIKE '%yellow%' OR title LIKE '%large%' OR title LIKE '%widgets%')
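How can I order the results based on these weightings?
Ideal methodology:
Each occurrence of a term in description is worth 1 point. Each occurrence in title is worth 5 points. But I don't know where to start doing this in SQL.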
Dam*_*ver 10
OK, let's put your search terms in a temp table:
CREATE TABLE #SearchTerms (Term varchar(50) not null)
insert into #SearchTerms (Term)
select 'yellow' union all
select 'large' union all
select 'widgets'
Let's do something silly:
select
widgets.ID,
(LEN(description) - LEN(REPLACE(description,Term,''))) / LEN(Term) as DescScore,
(LEN(title) - LEN(REPLACE(title,Term,''))) / LEN(Term) as TitleScore
from
widgets,#SearchTerms
We have now counted every occurrence of each term in the description and in the title.
So now we can sum and weight these occurrences:
select
widgets.ID,
SUM((LEN(description) - LEN(REPLACE(description,Term,''))) / LEN(Term) +
((LEN(title) - LEN(REPLACE(title,Term,''))) / LEN(Term) *5)) as CombinedScore
from
widgets,#SearchTerms
group by
Widgets.ID
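To see the weighting at work, here is a minimal, self-contained sketch run against made-up data. The #DemoWidgets and #DemoTerms tables and their rows are hypothetical, invented only for illustration; the scoring expression is the same one as above, written with an explicit CROSS JOIN. It assumes SQL Server 2008 or later for the multi-row VALUES syntax.

```sql
-- Hypothetical sample data; table names and rows are assumptions for illustration.
CREATE TABLE #DemoWidgets (
    ID          int          NOT NULL PRIMARY KEY,
    title       varchar(100) NOT NULL,
    description varchar(400) NOT NULL
);
INSERT INTO #DemoWidgets (ID, title, description) VALUES
    (1, 'yellow widgets on sale', 'a box of large widgets for the garden.'),
    (2, 'garden tools',           'includes yellow gloves and more.'),
    (3, 'blue gadget',            'nothing relevant here at all.');

CREATE TABLE #DemoTerms (Term varchar(50) NOT NULL);
INSERT INTO #DemoTerms (Term) VALUES ('yellow'), ('large'), ('widgets');

-- One row per (widget, term) pair, summed per widget.
-- Description hits count 1 point each, title hits 5 points each.
SELECT
    w.ID,
    SUM((LEN(w.description) - LEN(REPLACE(w.description, t.Term, ''))) / LEN(t.Term)
      + (LEN(w.title) - LEN(REPLACE(w.title, t.Term, ''))) / LEN(t.Term) * 5) AS CombinedScore
FROM #DemoWidgets AS w
CROSS JOIN #DemoTerms AS t
GROUP BY w.ID
ORDER BY CombinedScore DESC;
-- ID 1 scores 12 (2 title hits * 5 + 2 description hits),
-- ID 2 scores 1, ID 3 scores 0.
```

The sample text is all lowercase on purpose: whether REPLACE matches 'Yellow' against 'yellow' depends on the collation of the column, so mixed-case data may score differently on a case-sensitive database.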
If we need to do more with this, I'd recommend putting the above in a subselect:
select
w.*,CombinedScore
from
widgets w
inner join
(select
widgets.ID,
SUM((LEN(description) - LEN(REPLACE(description,Term,''))) / LEN(Term) +
((LEN(title) - LEN(REPLACE(title,Term,''))) / LEN(Term) *5)) as CombinedScore
from
widgets,#SearchTerms
group by
Widgets.ID
) t
on
w.ID = t.ID
where
CombinedScore > 0
order by
CombinedScore desc
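(Note that I've assumed there's an ID column in all of these examples, but this can be extended to as many columns as are needed to define the PK on the widgets table.)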
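As a rough sketch of that extension, suppose the table were keyed by two columns instead of ID. The #KeyedWidgets table, its VendorID and PartNo columns, and the sample rows are all hypothetical, made up for illustration; the idea is simply that every key column appears in the GROUP BY and in the join condition. The search terms are supplied inline through a VALUES derived table (SQL Server 2008 or later) to keep the example self-contained.

```sql
-- Hypothetical table keyed by (VendorID, PartNo); names and rows are assumptions.
CREATE TABLE #KeyedWidgets (
    VendorID    int          NOT NULL,
    PartNo      varchar(10)  NOT NULL,
    title       varchar(100) NOT NULL,
    description varchar(400) NOT NULL,
    PRIMARY KEY (VendorID, PartNo)
);
INSERT INTO #KeyedWidgets (VendorID, PartNo, title, description) VALUES
    (1, 'A1', 'large yellow widgets, boxed', 'yellow widgets sold in bulk.'),
    (1, 'B2', 'garden hose',                 'a large hose for the garden.'),
    (2, 'A1', 'blue gadget',                 'nothing relevant here.');

SELECT
    w.*, t.CombinedScore
FROM #KeyedWidgets AS w
INNER JOIN (
    SELECT
        k.VendorID,
        k.PartNo,
        SUM((LEN(k.description) - LEN(REPLACE(k.description, s.Term, ''))) / LEN(s.Term)
          + (LEN(k.title) - LEN(REPLACE(k.title, s.Term, ''))) / LEN(s.Term) * 5) AS CombinedScore
    FROM #KeyedWidgets AS k
    CROSS JOIN (VALUES ('yellow'), ('large'), ('widgets')) AS s(Term)
    GROUP BY k.VendorID, k.PartNo          -- every key column is grouped on...
) AS t
    ON  w.VendorID = t.VendorID            -- ...and joined on
    AND w.PartNo   = t.PartNo
WHERE t.CombinedScore > 0
ORDER BY t.CombinedScore DESC;
-- Returns (1, 'A1') with score 17, then (1, 'B2') with score 1;
-- (2, 'A1') scores 0 and is filtered out.
```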
The real trick here is counting the occurrences of a word within a larger body of text, which can be done with:
(LEN(text) - LEN(text with each occurrence of term removed)) / LEN(term)
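A small worked example may make the arithmetic concrete. The @text and @term values below are hypothetical, chosen only for illustration, and the inline DECLARE initialisation assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample values, chosen only to illustrate the arithmetic.
DECLARE @text varchar(200) = 'yellow widgets and more yellow widgets';
DECLARE @term varchar(50)  = 'yellow';

SELECT
    LEN(@text)                     AS FullLength,         -- 38
    LEN(REPLACE(@text, @term, '')) AS LengthWithoutTerm,  -- 26 (two 6-character terms removed)
    (LEN(@text) - LEN(REPLACE(@text, @term, ''))) / LEN(@term) AS Occurrences;  -- (38 - 26) / 6 = 2
```

Note that REPLACE removes every matching substring, so this counts substring matches rather than whole words: a term such as 'large' would also be counted inside 'enlarged'.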