Thi*_*ell 4 index index-tuning postgresql-9.5
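I have a table with more than 12 million rows of log data, and I migrated to Postgres 9.5 to take advantage of the new BRIN indexes because I am constrained on disk space. Since my log rows are naturally ordered by date, I assumed my case was tailor-made for a BRIN index.
However, I was taken aback by the results: BRIN is more than an order of magnitude slower than btree.
Original btree index: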
EXPLAIN ANALYZE SELECT COUNT(*) from logline where date BETWEEN '2016-01-15' and '2016-01-31';
Aggregate (cost=153488.38..153488.39 rows=1 width=0) (actual time=7672.508..7672.509 rows=1 loops=1)
-> Index Only Scan using logline_date on logline (cost=0.43..145945.76 rows=3017046 width=0) (actual time=18.548..4084.455 rows=2977593 loops=1)
Index Cond: ((date >= '2016-01-15 00:00:00-05'::timestamp with time zone) AND (date <= '2016-01-31 00:00:00-05'::timestamp with time zone))
Heap Fetches: 5809
Planning time: 0.293 ms
Execution time: 7672.562 ms
(6 rows)
DROP index logline_date;
CREATE index logline_date_brin on logline using BRIN(date);
EXPLAIN ANALYZE SELECT COUNT(*) from logline where date BETWEEN '2016-01-15' and '2016-01-31';
Aggregate (cost=1230518.30..1230518.31 rows=1 width=0) (actual time=105789.131..105789.133 rows=1 loops=1)
-> Bitmap Heap Scan on logline (cost=31543.27..1222862.87 rows=3062173 width=0) (actual time=103.876..100675.372 rows=2977593 loops=1)
Recheck Cond: ((date >= '2016-01-15 00:00:00-05'::timestamp with time zone) AND (date <= '2016-01-31 00:00:00-05'::timestamp with time zone))
Rows Removed by Index Recheck: 2899899
Heap Blocks: lossy=696832
-> Bitmap Index Scan on logline_date_brin (cost=0.00..30777.73 rows=3062173 width=0) (actual time=103.079..103.079 rows=6968320 loops=1)
Index Cond: ((date >= '2016-01-15 00:00:00-05'::timestamp with time zone) AND (date <= '2016-01-31 00:00:00-05'::timestamp with time zone))
Planning time: 0.377 ms
Execution time: 105805.567 ms
(9 rows)
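The BRIN index is more than 600 times smaller than the btree, but I did not expect execution to be this much slower.
Does this mean BRIN is not a good fit for me, or am I doing something wrong?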
My guess is that the rows were not imported in date order during the migration. You can check this by running
select * from logline;
and checking whether the dates appear to increase monotonically. If they do not, you can try sorting the table, for example:
select * into logline2 from logline order by date asc;
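Rather than eyeballing the output of select *, the planner statistics give a rough numeric measure of how closely the physical row order follows the date column. The following is a minimal sketch, assuming both tables live in the default schema and have been analyzed; a correlation close to 1 (or -1) suggests the rows are stored nearly in date order, while a value near 0 suggests they are scattered, which is the case where BRIN tends to do poorly.

```sql
-- Refresh statistics so pg_stats reflects the current table contents.
ANALYZE logline;
ANALYZE logline2;

-- correlation: statistical correlation between physical row order
-- and the logical order of the column values (range -1 .. +1).
SELECT tablename, attname, correlation
FROM pg_stats
WHERE tablename IN ('logline', 'logline2')
  AND attname = 'date';
```

If logline2 shows a correlation much closer to 1 than logline does, that would support the guess that the original import was not in date order.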
...create the index on the second table...
CREATE index logline2_date_brin on logline2 using BRIN(date);
...and try your "luck" with the second table:
EXPLAIN ANALYZE SELECT COUNT(*) from logline2 where date BETWEEN '2016-01-15' and '2016-01-31';
If the timing is noticeably better, you are all set.
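If the sorted table still reports a large "Rows Removed by Index Recheck" count, one further knob worth trying is the BRIN storage parameter pages_per_range, which defaults to 128. The sketch below rebuilds the index with smaller block ranges; the value 32 is only an illustrative assumption, not a recommendation, and smaller ranges trade a somewhat larger index for finer-grained summaries.

```sql
-- Rebuild the BRIN index with smaller block ranges (default is 128).
-- The value 32 is an assumed starting point for experimentation.
DROP INDEX IF EXISTS logline2_date_brin;
CREATE INDEX logline2_date_brin ON logline2
    USING brin (date) WITH (pages_per_range = 32);

-- Check how much disk space the rebuilt index takes.
SELECT pg_size_pretty(pg_relation_size('logline2_date_brin'));

-- Re-run the original query and compare the recheck and timing figures.
EXPLAIN ANALYZE
SELECT COUNT(*) FROM logline2
WHERE date BETWEEN '2016-01-15' AND '2016-01-31';
```

Comparing the "Heap Blocks: lossy" and "Rows Removed by Index Recheck" lines before and after should show whether the finer ranges help for this data.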
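If disk space is a constraint, you should also look at the cstore_fdw extension. It is well suited to analytics and can store data in compressed form. It has an indexing feature similar to BRIN, but comes with some limitations: data is append-only and transactions are not supported.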