kev*_*b11 6 postgresql search full-text-search
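When I use ts_rank with a tsquery that contains multiple terms joined by the & operator, the proximity of the terms affects the ranking and produces results I did not expect. An example: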
select ts_rank(to_tsvector('why in the world is this not working?'), plainto_tsquery('world working'));
RESULT: 0.095243
select ts_rank(to_tsvector('why in the world is this not - at least as I would expect - working?'), plainto_tsquery('world working'));
RESULT: 0.0397712
select ts_rank(to_tsvector('why in the world is this not - at least as I would expect - working? I just do not get it'), plainto_tsquery('world working'));
RESULT: 0.0397712
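In the documentation, ts_rank is described as simply measuring the frequency of matches.
ts_rank([ weights float4[], ] vector tsvector, query tsquery [, normalization integer ]) returns float4: Ranks vectors based on the frequency of their matching lexemes.
However, the examples above appear to measure frequency and, for multi-term queries, proximity as well.
This gives me unexpected results in the following example: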
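One way to check whether proximity is what actually moves the score is to rank the same document three ways: with the original & query, with the same terms joined by |, and with the & query against a tsvector whose positions have been removed by strip(). This is only a diagnostic sketch, not something taken from the documentation. The column aliases are made up for illustration, and it assumes the default text search configuration stems both terms the same way as in the examples above.

```sql
-- Same document and terms as in the examples above.
-- Only the query operator or the presence of position data changes.
SELECT
    -- original case: terms combined with & on a tsvector that has positions
    ts_rank(doc, plainto_tsquery('world working'))        AS and_with_positions,
    -- same terms combined with | instead of &
    ts_rank(doc, to_tsquery('world | working'))           AS or_with_positions,
    -- same & query, but strip() has removed positions and weights
    ts_rank(strip(doc), plainto_tsquery('world working')) AS and_without_positions
FROM (
    SELECT to_tsvector('why in the world is this not - at least as I would expect - working?') AS doc
) AS t;
```

If the first column differs from the other two while the matched lexemes stay the same, that would support the idea that the distance between positions is part of the score for & queries.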
select ts_rank(to_tsvector('why in the world is this not - at least as I would expect - working?'), plainto_tsquery('world'));
RESULT: 0.0607927
select ts_rank(to_tsvector('why in the world is this not - at least as I would expect - working?'), plainto_tsquery('world working'));
RESULT: 0.0397712
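I would expect the document to rank higher for the second query, because it matches more than one term in the query, yet it ranks lower.
Is there a way to prevent this behavior? Am I misunderstanding ts_rank or how to use it?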