Ale*_*øld 5 postgresql index execution-plan postgresql-11
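Postgres uses a sequential scan in a case where it could use a partial index to get an index-only scan. This happens only when the IN clause has more than 100 elements.
Given the following table: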
create table foo(id bigint primary key, bar bigint);
insert into foo (id, bar)
select g.id, case when id % 1000 = 0 then id else null end
from generate_series(1, 10000000) AS g (id) ;
--Create partial index
create unique index ix_foo_bar on foo(bar) where bar is not null;
analyze foo;
And given the following query with a large IN list:
explain analyze select count(*) from foo where bar in (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101);
The query plan shows a sequential scan. It is slow and has a high estimated cost:
QUERY PLAN
------------------------------------
 Finalize Aggregate (cost=612955.35..612955.36 rows=1 width=8) (actual time=254.605..254.605 rows=1 loops=1)
   -> Gather (cost=612955.13..612955.34 rows=2 width=8) (actual time=254.474..258.242 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         -> Partial Aggregate (cost=611955.13..611955.14 rows=1 width=8) (actual time=247.743..247.744 rows=1 loops=3)
               -> Parallel Seq Scan on foo (cost=0.00..611955.03 rows=42 width=0) (actual time=247.740..247.740 rows=0 loops=3)
                     Filter: (bar = ANY ('{1,2,3,4,5(...)
                     Rows Removed by Filter: 3333333
 Planning Time: 0.867 ms
 Execution Time: 258.323 ms
Changing enable_seqscan has no effect: the planner still performs a sequential scan.
If I add "is not null" to the query, it uses the index:
explain analyze select count(*) from foo where bar is not null and bar in (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101);
This gives an index-only scan:
QUERY PLAN
----------------------------------------------------------------------
 Aggregate (cost=153.55..153.56 rows=1 width=8) (actual time=0.267..0.267 rows=1 loops=1)
   -> Index Only Scan using ix_foo_bar on foo (cost=0.29..153.55 rows=1 width=0) (actual time=0.262..0.262 rows=0 loops=1)
         Index Cond: (bar = ANY ('{1,2,3,4,5,6 (...), 101}'::bigint[]))
         Heap Fetches: 0
 Planning Time: 0.531 ms
 Execution Time: 0.319 ms
(6 rows)
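It also uses the index if the IN clause has fewer elements (the cutoff is 100 versus 101), or if I have a full index instead of a partial one.
Why doesn't Postgres use the partial index when the IN clause has more than 100 elements? Is this a known limitation of the query planner, or a bug?
This will be fixed in the upcoming version 12.
I think the summary here is that the planner is only willing to do so much work to try to prove that a partial index can be used, because every query has to pay for that work, even those that end up not using a partial index. With this change, they simply found a more efficient way to do that work for this specific case, so the 100-element limit no longer applies to it.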
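Until an upgrade to version 12 is possible, the workaround already found in the question can be applied as a habit. The following is a minimal sketch, assuming the foo table and ix_foo_bar index defined in the question; the comments about why it helps paraphrase the answer above and are not taken from the planner source.

```sql
-- Check which server version is running; per the answer above,
-- the 100-element limit is expected to be lifted in version 12.
show server_version;

-- On version 11, repeat the partial index's own predicate in the query.
-- The planner can then match the predicate directly instead of having
-- to prove it from an IN list with more than 100 elements.
explain analyze
select count(*)
from foo
where bar is not null  -- same condition as in "create unique index ... where bar is not null"
  and bar in (
    1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,
    21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,
    41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,
    61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,
    81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,
    101
  );
```

The extra condition does not change the result, since a null bar can never match an IN list of non-null values. Going by the answer, it should become unnecessary on version 12.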