PostgreSQL 9.5 BRIN index is much slower than expected

Thi*_*ell 4 index index-tuning postgresql-9.5

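I have a table with more than 12 million rows of log data, and I migrated to Postgres 9.5 to take advantage of the new BRIN indexes, because I have disk space constraints. Since my log rows are naturally ordered by date, I assumed my case was tailor-made for a BRIN index.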

However, I was surprised by the results: BRIN is more than an order of magnitude slower than the B-tree.

Original B-tree index:

EXPLAIN ANALYZE SELECT COUNT(*) from logline where date BETWEEN '2016-01-15' and '2016-01-31';

 Aggregate  (cost=153488.38..153488.39 rows=1 width=0) (actual time=7672.508..7672.509 rows=1 loops=1)
   ->  Index Only Scan using logline_date on logline  (cost=0.43..145945.76 rows=3017046 width=0) (actual time=18.548..4084.455 rows=2977593 loops=1)
         Index Cond: ((date >= '2016-01-15 00:00:00-05'::timestamp with time zone) AND (date <= '2016-01-31 00:00:00-05'::timestamp with time zone))
         Heap Fetches: 5809
 Planning time: 0.293 ms
 Execution time: 7672.562 ms
(6 rows)


DROP index logline_date;
CREATE index logline_date_brin on logline using BRIN(date);

EXPLAIN ANALYZE SELECT COUNT(*) from logline where date BETWEEN '2016-01-15' and '2016-01-31';

 Aggregate  (cost=1230518.30..1230518.31 rows=1 width=0) (actual time=105789.131..105789.133 rows=1 loops=1)
   ->  Bitmap Heap Scan on logline  (cost=31543.27..1222862.87 rows=3062173 width=0) (actual time=103.876..100675.372 rows=2977593 loops=1)
         Recheck Cond: ((date >= '2016-01-15 00:00:00-05'::timestamp with time zone) AND (date <= '2016-01-31 00:00:00-05'::timestamp with time zone))
         Rows Removed by Index Recheck: 2899899
         Heap Blocks: lossy=696832
         ->  Bitmap Index Scan on logline_date_brin  (cost=0.00..30777.73 rows=3062173 width=0) (actual time=103.079..103.079 rows=6968320 loops=1)
               Index Cond: ((date >= '2016-01-15 00:00:00-05'::timestamp with time zone) AND (date <= '2016-01-31 00:00:00-05'::timestamp with time zone))
 Planning time: 0.377 ms
 Execution time: 105805.567 ms
(9 rows)

The BRIN index is more than 600 times smaller than the B-tree, but I did not expect the execution time to be this much worse.
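For anyone reproducing the size comparison, here is a minimal sketch using PostgreSQL's built-in size functions. It assumes both indexes exist at the same time, which was not the case in the session above, where the B-tree was dropped before the BRIN index was created.

```sql
-- Compare on-disk sizes of the table and its indexes.
-- Assumes logline_date (B-tree) and logline_date_brin both exist;
-- names that do not exist simply return no row.
SELECT c.relname,
       pg_size_pretty(pg_relation_size(c.oid)) AS size
FROM pg_class AS c
WHERE c.relname IN ('logline', 'logline_date', 'logline_date_brin')
ORDER BY pg_relation_size(c.oid) DESC;
```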

Does this mean BRIN is not a good fit for my case, or am I doing something wrong?

hru*_*ske 5

My guess is that the rows were not imported in date order during the migration. You can check this by running

select * from logline; 

and seeing whether the dates appear to increase monotonically. If they do not, you can try sorting the table, for example:

select * into logline2 from logline order by date asc;

...create the index on the second table...

CREATE index logline2_date_brin on logline2 using BRIN(date);

...and try your "luck" with the second table:

EXPLAIN ANALYZE SELECT COUNT(*) from logline2 where date BETWEEN '2016-01-15' and '2016-01-31';

If the timing is noticeably better, you have found the cause.
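A less manual way to test the ordering assumption than eyeballing the output of a full select is to look at the planner's own estimate of how well physical row order tracks the column values. The sketch below uses the standard pg_stats view. The figure is sample-based, so treat it as an indication rather than proof, and logline2 appears only if you created the sorted copy.

```sql
-- Refresh planner statistics so pg_stats reflects the current table contents.
ANALYZE logline;

-- correlation near 1 (or -1): physical row order closely follows date order,
-- which is the layout BRIN relies on. Values near 0 suggest scattered rows.
SELECT tablename, attname, correlation
FROM pg_stats
WHERE tablename IN ('logline', 'logline2')
  AND attname = 'date';
```

A correlation well below 1 for logline would be consistent with the plan in the question, where the recheck discarded about as many rows (2899899) as the query kept (2977593).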


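If you have disk space constraints, you should also look at the cstore_fdw extension. It is well suited to analytics and can store data in compressed form. It has an indexing feature similar to BRIN, but with some limitations: data is append-only and transactions are not supported.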
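As a rough sketch of what trying cstore_fdw might look like: the column list below is hypothetical, since only the date column is known from the question, and message stands in for whatever other columns logline has. The server name cstore_server and table name logline_cstore are likewise illustrative.

```sql
-- Requires cstore_fdw to be installed and listed in shared_preload_libraries.
CREATE EXTENSION cstore_fdw;
CREATE SERVER cstore_server FOREIGN DATA WRAPPER cstore_fdw;

-- Hypothetical column list: only "date" is known from the question;
-- "message" is a placeholder for the remaining log columns.
CREATE FOREIGN TABLE logline_cstore (
    date    timestamp with time zone,
    message text
)
SERVER cstore_server
OPTIONS (compression 'pglz');

-- Bulk-load in date order so each block covers a narrow date range.
-- cstore_fdw does not support single-row INSERT, UPDATE, or DELETE.
INSERT INTO logline_cstore (date, message)
SELECT date, message FROM logline ORDER BY date;
```

Loading in date order matters here for the same reason it does for BRIN: range-based skipping only helps when each block covers a narrow range of dates.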