I have a table defined as follows:
CREATE TABLE details_search (
id int4 NOT NULL PRIMARY KEY,
"search" tsvector NULL
);
CREATE INDEX details_search_idx ON details_search USING gin (search);
I ran this to get an idea of its size:
SELECT pg_size_pretty(pg_relation_size('details_search')) relation_size,
pg_size_pretty(pg_total_relation_size('details_search')) total_relation_size,
pg_size_pretty(pg_table_size('details_search')) table_size,
pg_size_pretty(pg_indexes_size('details_search')) indexes_size;
These are the results:
relation_size|total_relation_size|table_size|indexes_size|
-------------+-------------------+----------+------------+
800 MB |64 GB |57 GB |6830 MB |
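The gap between relation_size (800 MB) and table_size (57 GB) suggests that most of the data lives outside the main heap, most likely in the table's TOAST relation: pg_table_size counts TOAST storage, while pg_relation_size counts only the main fork. A minimal sketch for checking this, assuming the table is on the current search path (the alias toast_size is only illustrative):

```sql
-- Report the size of the TOAST relation attached to details_search,
-- including the TOAST relation's own index.
-- reltoastrelid is 0 when a table has no TOAST relation, hence the filter.
SELECT pg_size_pretty(pg_total_relation_size(c.reltoastrelid::regclass)) AS toast_size
FROM pg_class c
WHERE c.oid = 'details_search'::regclass
  AND c.reltoastrelid <> 0;
```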
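I am only interested in performing phrase searches, and those searches are used in aggregations. When I run a phrase search with uncommon terms, everything works fine. However, when the phrase is made up of common terms, performance suffers badly.
This query took 192 seconds: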
SELECT COUNT(id)
FROM details_search
WHERE search @@ phraseto_tsquery('simple', 'data management')
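For context, phraseto_tsquery turns the phrase into two lexemes joined by the FOLLOWED BY operator (<->), so both lexemes have to be present and adjacent. With the 'simple' configuration there is no stemming, so the parsed query can be inspected directly:

```sql
-- Show the tsquery that the phrase search is actually evaluated with.
SELECT phraseto_tsquery('simple', 'data management');
-- 'data' <-> 'management'
```

Since a GIN index on a tsvector column stores lexemes but not their positions, rows containing both terms probably have to be rechecked against the stored tsvector. That is an assumption on my part, but it would be consistent with the large buffer counts in the plan.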
This is the query plan (the same plan is also available in a nicer interface):
Output: count(id)
Buffers: shared hit=25942383 read=6354221 written=4588
I/O Timings: shared/local read=512605.708 write=122.864
-> Gather (cost=178176.43..178176.64 rows=2 …