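I read the documentation for the Postgres bloom extension, but I cannot reproduce the results it shows. Please help me understand what I am missing. My server is: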
SHOW server_version;
server_version
-------------------------------
10.6 (Debian 10.6-1.pgdg90+1)
dev=# show random_page_cost;
random_page_cost
------------------
4
First I create the table using the same command as in the documentation:
dev=# CREATE TABLE tbloom AS
SELECT
(random() * 1000000)::int as i1,
(random() * 1000000)::int as i2,
(random() * 1000000)::int as i3,
(random() * 1000000)::int as i4,
(random() * 1000000)::int as i5,
(random() * 1000000)::int as i6
FROM
generate_series(1,10000000);
Next I create a btree index:
dev=# CREATE index btreeidx ON tbloom (i1, i2, i3, i4, i5, i6);
CREATE INDEX
and get the following plan:
dev=# EXPLAIN ANALYZE SELECT * FROM tbloom WHERE i2 = 898732 AND i5 = 123451;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Gather (cost=1000.00..127195.10 rows=1 width=24) (actual time=258.963..260.900 rows=0 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on tbloom (cost=0.00..126195.00 rows=1 width=24) (actual time=255.446..255.446 rows=0 loops=3)
Filter: ((i2 = 898732) AND (i5 = 123451))
Rows Removed by Filter: 3333333
Planning time: 0.412 ms
Execution time: 260.939 ms
Execution time: 260.939 ms
Now I drop the btree index and create a bloom index:
dev=# DROP INDEX btreeidx;
DROP INDEX
dev=# CREATE INDEX bloomidx ON tbloom USING bloom (i1, i2, i3, i4, i5, i6);
CREATE INDEX
and get the new plan:
dev=# EXPLAIN ANALYZE SELECT * FROM tbloom WHERE i2 = 898732 AND i5 = 123451;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Gather (cost=1000.00..127195.10 rows=1 width=24) (actual time=260.278..261.989 rows=0 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on tbloom (cost=0.00..126195.00 rows=1 width=24) (actual time=256.224..256.224 rows=0 loops=3)
Filter: ((i2 = 898732) AND (i5 = 123451))
Rows Removed by Filter: 3333333
Planning time: 0.165 ms
Execution time: 262.053 ms
Execution time: 262.053 ms
In the documentation bloom is better than btree, but not in my tests. I tried several values of the length option, but none of them gave a better result.
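When bloom was introduced in 9.6, parallel query had also only just been introduced and was off by default. In the documentation's example, bloom looks better than a non-parallel sequential scan. Once a parallel seq scan is available, however, the planner estimates it to be cheaper than using the bloom index. It is not actually faster, which you can verify by turning off parallel query with set max_parallel_workers_per_gather TO 0 and comparing the actual execution time, but the planner thinks the parallel seq scan will be better. It looks like the cost estimation for bloom could use some work.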
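The check described above can be sketched as follows, using the table and query from the question. The setting applies to the current session only. Whether the planner then picks the bloom index, and how fast it runs, depends on your data and hardware, so no expected output is shown.

```sql
-- Disable parallel query for the current session only.
SET max_parallel_workers_per_gather TO 0;

-- Re-run the query from the question. With parallel plans unavailable,
-- the planner may now choose a bitmap scan on bloomidx (not guaranteed).
EXPLAIN ANALYZE SELECT * FROM tbloom WHERE i2 = 898732 AND i5 = 123451;

-- Restore the server default afterwards.
RESET max_parallel_workers_per_gather;
```

Comparing the "Execution time" line of this plan with the parallel seq scan plan shows which one is actually faster on your machine.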
When parallel query was turned on by default in v10, the example in the documentation was not updated, so it no longer works as advertised.
Note that neither of your plans used an index at all, so you cannot really draw any conclusion about which index type is better for this scenario.
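One way to get a like-for-like comparison is to discourage sequential scans while testing each index in turn. This is only a sketch under the assumption that one of the two indexes from the question exists at the time. Note that enable_seqscan = off does not forbid sequential scans, it only makes the planner treat them as very expensive, so check the plan to confirm an index was really used.

```sql
-- Session-only settings for testing; not meant for production use.
SET max_parallel_workers_per_gather TO 0;
SET enable_seqscan TO off;

-- Run once with only btreeidx present and once with only bloomidx present,
-- then compare the plans and the reported execution times.
EXPLAIN ANALYZE SELECT * FROM tbloom WHERE i2 = 898732 AND i5 = 123451;

-- Restore the defaults afterwards.
RESET enable_seqscan;
RESET max_parallel_workers_per_gather;
```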