Posts by P.P*_*ter

Trigram search gets slower as the search string gets longer

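In a Postgres 9.1 database, I have a table table1 with about 1.5M rows and a column label (names simplified for the sake of this question).

There is a functional trigram index on lower(unaccent(label)) (unaccent() has been made immutable so that it can be used in the index).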
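For readers who want to reproduce the setup, here is a minimal sketch of how such an index might be defined. The question does not say how unaccent() was made immutable or whether the index is GIN or GiST, so the ALTER FUNCTION statement, the index name table1_label_trgm_idx, and the choice of GIN are all assumptions.

```sql
-- Assumes the pg_trgm and unaccent extensions can be installed.
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS unaccent;

-- Assumption: unaccent() was marked immutable directly.
-- Caveat: if the unaccent dictionary changes later, the index must be rebuilt.
ALTER FUNCTION unaccent(text) IMMUTABLE;

-- Hypothetical index name; GIN is assumed, gist + gist_trgm_ops is the alternative.
CREATE INDEX table1_label_trgm_idx
    ON table1
 USING gin (lower(unaccent(label)) gin_trgm_ops);
```

The index can only be used when the query's WHERE clause contains the same expression, lower(unaccent(label)), as the queries in this question do.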

The following query is quite fast:

SELECT count(*) FROM table1
WHERE (lower(unaccent(label)) like lower(unaccent('%someword%')));
 count 
-------
     1
(1 row)

Time: 394,295 ms

But the following query is slower:

SELECT count(*) FROM table1
WHERE (lower(unaccent(label)) like lower(unaccent('%someword and some more%')));
 count 
-------
     1
(1 row)

Time: 1405,749 ms

Adding more words makes it even slower, even though the search is stricter.
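One thing worth checking, offered as a sketch and not as a diagnosis, is how many trigrams each search string produces, since every trigram may mean another index lookup. The query below uses show_trgm() from pg_trgm. Note that show_trgm() pads each word with blanks, so its output only approximates the set of trigrams extracted from a LIKE pattern, and whether the trigram count explains the slowdown here is an assumption.

```sql
-- show_trgm() is provided by the pg_trgm extension and returns text[].
SELECT array_length(show_trgm('someword'), 1)               AS trigrams_short,
       array_length(show_trgm('someword and some more'), 1) AS trigrams_long;
```

Comparing the two counts with the timings above gives a rough idea of whether query time scales with the number of trigrams.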

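I tried a simple trick: run a subquery for the first word and then filter the result with the full search string, but (sadly) the query planner saw through my scheme: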

EXPLAIN ANALYZE
SELECT * FROM (
   SELECT id, title, label from table1
   WHERE lower(unaccent(label)) like lower(unaccent('%someword%'))
   ) t1
WHERE lower(unaccent(label)) like lower(unaccent('%someword and some more%'));

Bitmap Heap Scan on table1  (cost=16216.01..16220.04 rows=1 width=212) (actual time=1824.017..1824.019 rows=1 loops=1) …

postgresql full-text-search pattern-matching postgresql-9.1 postgresql-9.4

P.P*_*ter

2015-09-16

Score: 17

Answers: 1

Views: 2106

PostgreSQL query is very slow when a subquery is added

I have a relatively simple query on a table with 1.5M rows:

SELECT mtid FROM publication
WHERE mtid IN (9762715) OR last_modifier=21321
LIMIT 5000;

EXPLAIN ANALYZE output:

Limit  (cost=8.84..12.86 rows=1 width=8) (actual time=0.985..0.986 rows=1 loops=1)
  ->  Bitmap Heap Scan on publication  (cost=8.84..12.86 rows=1 width=8) (actual time=0.984..0.985 rows=1 loops=1)
        Recheck Cond: ((mtid = 9762715) OR (last_modifier = 21321))
        ->  BitmapOr  (cost=8.84..8.84 rows=1 width=0) (actual time=0.971..0.971 rows=0 loops=1)
              ->  Bitmap Index Scan on publication_pkey  (cost=0.00..4.42 rows=1 width=0) (actual time=0.295..0.295 rows=1 loops=1)
                    Index Cond: (mtid = 9762715)
              ->  Bitmap Index Scan on …

postgresql performance subquery hibernate query-performance

P.P*_*ter

2020-06-15

Score: 13

Answers: 2

Views: 10k

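PostgreSQL trigram GIST vs. GIN index

I have a PostgreSQL 9.1 database with 10M+ rows and some text fields that need similarity and LIKE '%word%' searches, so I decided to use trigram indexes.

Initially I started with GIN indexes, but now I wonder whether I should use GIST instead.

The theory says:

GIN index lookups are about three times faster than GiST
GIN indexes take about three times longer to build than GiST
GIN indexes are moderately slower to update than GiST indexes, but about 10 times slower if fast-update support was disabled (see Section 58.4.1 for details)
GIN indexes are two to three times larger than GiST indexes

I created both GIN and GIST indexes and found the following in practice (on my dataset):

For LIKE '%word%' queries, GIN is about as fast as GIST (or even 5-20% slower) the first time, when the index entries for the query's trigrams are not yet cached.
For LIKE '%word%' queries, GIN is about 5 times faster than GIST if the trigrams in the query were searched recently. GIST always has the same speed regardless of caching.
For % 'word' (similarity) queries, GIN is about 5-8 times slower than GIST, depending on how much of the index is cached.

GIN is about 10% smaller than GIST. However, it seems to grow faster when there are UPDATE statements on the table. Unless, of course, I am not running VACUUM FULL often enough.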
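A comparison like the one described above could be set up roughly as follows. This is only a sketch: the table docs, the column body, and the index names are hypothetical placeholders, and the statements assume the pg_trgm extension is installed.

```sql
-- Hypothetical table "docs" with a text column "body".
CREATE INDEX docs_body_trgm_gin  ON docs USING gin  (body gin_trgm_ops);
CREATE INDEX docs_body_trgm_gist ON docs USING gist (body gist_trgm_ops);

-- Pattern search and similarity search, timed with EXPLAIN ANALYZE.
EXPLAIN ANALYZE SELECT count(*) FROM docs WHERE body LIKE '%word%';
EXPLAIN ANALYZE SELECT count(*) FROM docs WHERE body % 'word';

-- Compare the on-disk size of the two indexes.
SELECT pg_size_pretty(pg_relation_size('docs_body_trgm_gin'))  AS gin_size,
       pg_size_pretty(pg_relation_size('docs_body_trgm_gist')) AS gist_size;
```

With both indexes present the planner picks one of them, so timing each index type separately means dropping the other index first or building only one at a time.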

So I see some discrepancies between theory and practice:

Speed …

postgresql full-text-search postgresql-9.1

P.P*_*ter

2015-08-25

Score: 11

Answers: 0

Views: 3546

Tag statistics

postgresql ×3

full-text-search ×2

postgresql-9.1 ×2

hibernate ×1

pattern-matching ×1

performance ×1

postgresql-9.4 ×1

query-performance ×1

subquery ×1
