ALZ*_*ALZ 5 postgresql performance index postgresql-9.5 postgresql-performance
I created a table like this (similar to the example at http://use-the-index-luke.com/sql/example-schema/postgresql/performance-testing-scalability):
CREATE TABLE scale_data (
section NUMERIC NOT NULL,
id1 NUMERIC NOT NULL, -- unique values simulating ID or Timestamp
id2 NUMERIC NOT NULL -- a kind of Type
);
Populate it:
INSERT INTO scale_data
SELECT sections.sections, sections.sections*10000 + gen.gen
, CEIL(RANDOM()*100)
FROM GENERATE_SERIES(1, 300) sections,
GENERATE_SERIES(1, 90000) gen
WHERE gen <= sections * 300;
It generated 13545000 records.
Then I created a composite index:
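As a sanity check, the row count follows from the WHERE clause of the INSERT: each section s contributes s * 300 rows, so the total is 300 * (1 + 2 + ... + 300) = 300 * 45150 = 13545000. A minimal sketch that reproduces the figure, assuming the table was loaded exactly as shown above (the alias `s` and the column name `expected_rows` are illustrative):

```sql
-- Expected number of rows, computed without touching the table:
-- every section s in 1..300 keeps the gen values 1..s*300.
SELECT SUM(s * 300) AS expected_rows
FROM GENERATE_SERIES(1, 300) AS s;   -- 13545000

-- Actual number of rows in the table, for comparison.
SELECT COUNT(*) AS actual_rows
FROM scale_data;
```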
CREATE INDEX id1_id2_idx
ON public.scale_data
USING btree
(id1, id2);
And select #1:
select id2 from scale_data
where id2 in (50)
order by id1 desc
limit 500
EXPLAIN ANALYZE:
"Limit (cost=0.56..1177.67 rows=500 width=11) (actual time=0.046..5.124 rows=500 loops=1)"
" -> Index Only Scan Backward using id1_id2_idx on scale_data (cost=0.56..311588.74 rows=132353 width=11) (actual time=0.045..5.060 rows=500 loops=1)"
" Index Cond: (id2 = '50'::numeric)"
" Heap Fetches: 0"
"Planning time: 0.103 ms"
"Execution time: 5.177 ms"
Select #2 (more values in IN; the plan changed):
select id2 from scale_data
where id2 in (50, 52)
order by id1 desc
limit 500
EXPLAIN ANALYZE #2:
"Limit (cost=0.56..857.20 rows=500 width=11) (actual time=0.061..8.703 rows=500 loops=1)"
" -> Index Only Scan Backward using id1_id2_idx on scale_data (cost=0.56..445780.74 rows=260190 width=11) (actual time=0.059..8.648 rows=500 loops=1)"
" Filter: (id2 = ANY ('{50,52}'::numeric[]))"
" Rows Removed by Filter: 25030"
" Heap Fetches: 0"
"Planning time: 0.153 ms"
"Execution time: 8.771 ms"
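Why do the plans differ? Why does #1 show the predicate as an Index Cond, while #2 shows a Filter together with the number of rows removed by it? Doesn't SQL #1 traverse the index in the same way that the explain for SQL #2 shows?
On the real/production DB, #2 runs much slower, even though searching by each of the 2 keys separately is fast.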
PG 9.5
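I wouldn't let this bother you. In this instance, I believe Filter just means there are multiple conditional statements on the index (which is what the IN array operation translates to, as far as I know). Either way, both are evaluated inside the Index Only Scan Backward. It works the same way with OR: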
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.56..1219.95 rows=500 width=11) (actual time=0.061..16.159 rows=500 loops=1)
-> Index Only Scan Backward using id1_id2_idx on scale_data (cost=0.56..679161.56 rows=278484 width=11) (actual time=0.060..16.086 rows=500 loops=1)
Filter: ((id2 = '50'::numeric) OR (id2 = '52'::numeric))
Rows Removed by Filter: 24673
Heap Fetches: 25173
Planning time: 0.206 ms
Execution time: 16.235 ms
(7 rows)
test=# EXPLAIN ANALYZE select id2 from scale_data
where id2 in (50, 52)
order by id1 desc
limit 500
;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.56..1153.17 rows=500 width=11) (actual time=0.072..18.604 rows=500 loops=1)
-> Index Only Scan Backward using id1_id2_idx on scale_data (cost=0.56..645299.05 rows=279930 width=11) (actual time=0.070..18.506 rows=500 loops=1)
Filter: (id2 = ANY ('{50,52}'::numeric[]))
Rows Removed by Filter: 24673
Heap Fetches: 25173
Planning time: 0.187 ms
Execution time: 18.695 ms
(7 rows)
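The two plans above report Heap Fetches: 25173 (the 500 returned rows plus the 24673 filtered ones), while the plans in the question report Heap Fetches: 0. In an index-only scan, a nonzero value generally means the visibility map does not yet mark those pages as all-visible, so each index entry is also checked against the heap. A minimal sketch for repeating the measurement, assuming the test table simply had not been vacuumed after loading:

```sql
-- Refresh the visibility map and planner statistics for the test table.
VACUUM ANALYZE scale_data;

-- Re-run the query; BUFFERS additionally shows how many pages were read.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id2
FROM scale_data
WHERE id2 IN (50, 52)
ORDER BY id1 DESC
LIMIT 500;
```

If the assumption holds, Heap Fetches should drop toward 0, as in the question's plans.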
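If the two-value query is noticeably slower on a production database while each single-value query is fast, one rewrite that may be worth testing is to run the single-value queries separately and merge their results. This is only a sketch, not something taken from the plans above; the subquery alias `t` is illustrative, and whether it helps depends on the real data and indexes:

```sql
-- Each branch returns the newest 500 rows for one id2 value; the outer
-- query then keeps the newest 500 overall. The result matches
-- "WHERE id2 IN (50, 52) ORDER BY id1 DESC LIMIT 500", because any row
-- in the overall top 500 is also in the top 500 for its own id2 value.
SELECT id2
FROM (
    (SELECT id1, id2
     FROM scale_data
     WHERE id2 = 50
     ORDER BY id1 DESC
     LIMIT 500)
    UNION ALL
    (SELECT id1, id2
     FROM scale_data
     WHERE id2 = 52
     ORDER BY id1 DESC
     LIMIT 500)
) AS t
ORDER BY id1 DESC
LIMIT 500;
```

Comparing EXPLAIN ANALYZE for this form against the IN form on the production data would show whether the rewrite is actually faster there.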