如何在postgresql中为regexp_matches创建索引？

Question

如何在postgresql中为regexp_matches创建索引？

我有一个桌子产品

product_id | desciption                                     
============================================================
322919     | text {add}185{/add} text                       
322920     | text {add}184{/add} text {add}185{/add} text   
322921     | text {add}185{/add} text {add}187{/add} text

Run Code Online (Sandbox Code Playgroud)

使用like进行sql查询非常慢

SELECT product_id, desciption 
FROM product 
WHERE LOWER(desciption) like '%{add}185{/add}%'
> Time: 340,159s

Run Code Online (Sandbox Code Playgroud)

我只需要一个索引来搜索 {add}185{/add} 表达式。即需要为此表建立索引

SELECT product_id, regexp_matches (desciption, '(\{add\}\d+\{\/add\})', 'g') 
FROM product

Run Code Online (Sandbox Code Playgroud)

返回：

product_id | regexp_matches 
================================================================================
322919     | {"{add}185{/add}"}
322920     | {"{add}184{/add}"}
322920     | {"{add}185{/add}"}
322921     | {"{add}185{/add}"}
322921     | {"{add}187{/add}"}

Run Code Online (Sandbox Code Playgroud)

为数据采样创建索引哪个更好？
在“WHERE”中使用哪个表达式更好？

Answer 1

jja*_*nes 9

最简单的解决方案就是构建一个pg_trgm 索引。

 create extension pg_trgm;
 create index on product using gin (description gin_trgm_ops);

Run Code Online (Sandbox Code Playgroud)

然后您可以使用相同的查询，仅删除 LOWER 并将 LIKE 更改为 ILIKE。

这可能应该足够好了，但如果还不够好，您可以创建一个更有针对性的索引。您将需要创建一个辅助函数来执行聚合，因为您无法将聚合直接放入函数索引中。

create function extract_tokens(text) returns text[] immutable language sql as $$ 
   select array_agg(regexp_matches[1]) from 
      regexp_matches ($1, '\{add\}(\d+)\{\/add\}+','g') 
$$;

Run Code Online (Sandbox Code Playgroud)

请注意，我将捕获括号移入，因此它们只获取数字而不获取周围的标签，这看起来像是噪音。事实上，有一场比赛就证明他们在那里，我们不需要看到他们。

create index on product using gin (extract_tokens(description))

select * from product where extract_tokens(description) @> ARRAY['185'];

Run Code Online (Sandbox Code Playgroud)

归档时间：	5 年，10 月前
查看次数：	2273 次
最近记录：	5 年，10 月前