dmk*_*kov · 5 · Tags: database, postgresql, amazon-rds, sql-execution-plan
I am trying to test the efficiency of different partitioning schemes in PostgreSQL. I created several databases with the same table but different partition sizes, and then loaded hundreds of millions of records into each of them.
Now I want to run the same queries against all of them and compare the results. The problem is that when I run the same query repeatedly against one database, I get very different actual times.
For example, the first execution of EXPLAIN (buffers, analyze, verbose) SELECT count(*) FROM testdata WHERE dt = '2017-06-01 08:50:00' gives:
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=356199.96..356199.97 rows=1 width=8) (actual time=155757.190..155757.190 rows=1 loops=1)
   Output: count(*)
   Buffers: shared read=290555
   ->  Append  (cost=0.00..354506.51 rows=677381 width=0) (actual time=165.565..155583.815 rows=665275 loops=1)
         Buffers: shared read=290555
         ->  Seq Scan on public.testdata  (cost=0.00..0.00 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1)
               Filter: (testdata.dt = '2017-06-01 08:50:00'::timestamp without time zone)
         ->  Bitmap Heap Scan on public.testdata_2017_06_01  (cost=12682.26..354506.51 rows=677380 width=0) (actual time=165.562..155319.677 rows=665275 loops=1)
               Recheck Cond: (testdata_2017_06_01.dt = '2017-06-01 08:50:00'::timestamp without time zone)
               Rows Removed by Index Recheck: 31006916
               Heap Blocks: exact=25283 lossy=263451
               Buffers: shared read=290555
               ->  Bitmap Index Scan on testdata_2017_06_01_dt_idx  (cost=0.00..12512.91 rows=677380 width=0) (actual time=159.597..159.597 rows=665275 loops=1)
                     Index Cond: (testdata_2017_06_01.dt = '2017-06-01 08:50:00'::timestamp without time zone)
                     Buffers: shared read=1821
 Planning time: 75.419 ms
 Execution time: 155757.564 ms
(17 rows)
And the second execution:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=356199.96..356199.97 rows=1 width=8) (actual time=18871.164..18871.164 rows=1 loops=1)
   Output: count(*)
   Buffers: shared read=290555
   ->  Append  (cost=0.00..354506.51 rows=677381 width=0) (actual time=148.208..18731.027 rows=665275 loops=1)
         Buffers: shared read=290555
         ->  Seq Scan on public.testdata  (cost=0.00..0.00 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1)
               Filter: (testdata.dt = '2017-06-01 08:50:00'::timestamp without time zone)
         ->  Bitmap Heap Scan on public.testdata_2017_06_01  (cost=12682.26..354506.51 rows=677380 width=0) (actual time=148.204..18489.171 rows=665275 loops=1)
               Recheck Cond: (testdata_2017_06_01.dt = '2017-06-01 08:50:00'::timestamp without time zone)
               Rows Removed by Index Recheck: 31006916
               Heap Blocks: exact=25283 lossy=263451
               Buffers: shared read=290555
               ->  Bitmap Index Scan on testdata_2017_06_01_dt_idx  (cost=0.00..12512.91 rows=677380 width=0) (actual time=143.119..143.119 rows=665275 loops=1)
                     Index Cond: (testdata_2017_06_01.dt = '2017-06-01 08:50:00'::timestamp without time zone)
                     Buffers: shared read=1821
 Planning time: 2.105 ms
 Execution time: 18871.482 ms
(17 rows)
The difference is huge (155 seconds versus 18 seconds). My first thought was that the first run cached the results, but neither EXPLAIN above shows any hit values in its Buffers sections, and the estimated costs are identical for both runs. Why do the actual times differ so much? Is there some kind of housekeeping work, such as building the query plan, that could take this long?
If I run the same query for a different date, the pattern repeats. If PostgreSQL caches query plans, is the plan regenerated for every new parameter value? That doesn't make sense to me, because in that case it would behave like a query result cache.
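As a point of reference, a plain EXPLAIN (ANALYZE) statement is planned from scratch on every execution; plan reuse only happens for prepared statements (and for statements inside PL/pgSQL functions). A minimal sketch of the prepared-statement case against the same table (not part of the original post):

-- Sketch only: plan caching applies to prepared statements, not to the
-- ad-hoc EXPLAINs shown above, which are re-planned on every run.
PREPARE count_by_dt (timestamp) AS
    SELECT count(*) FROM testdata WHERE dt = $1;

EXECUTE count_by_dt('2017-06-01 08:50:00');  -- early executions use custom plans per parameter
EXECUTE count_by_dt('2017-06-02 08:50:00');  -- later a generic plan may be cached and reused

DEALLOCATE count_by_dt;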
A side question: what is a better way to obtain "pure" query execution times for comparing the different partitioning schemes? In MySQL I can use SELECT SQL_NO_CACHE to disable any caching for a query. In PostgreSQL I thought DISCARD ALL was the equivalent, but it did not help with the EXPLAINs above.
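For what it's worth, DISCARD ALL only resets session state (prepared statements, temporary tables, session settings); it does not empty shared_buffers or the OS page cache, and PostgreSQL has no query result cache to switch off. A minimal sketch of comparing warm-cache timings instead, using track_io_timing to make actual reads visible (same query as above; a truly cold run requires restarting the server and dropping the OS page cache, which cannot be done from SQL):

-- Sketch only: compare steady-state (warm cache) timings rather than "pure" ones.
SET track_io_timing = on;   -- adds "I/O Timings" lines to EXPLAIN (ANALYZE, BUFFERS) output

EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT count(*) FROM testdata WHERE dt = '2017-06-01 08:50:00';
-- Repeat the statement several times and compare the runs in which
-- "Buffers" shows mostly "shared hit" rather than "shared read".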
Update #1:
Definition of the main testdata table:
=> \d+ testdata;
Table "public.testdata"
Column | Type | Modifiers | Storage | Stats target | Description
------------+-----------------------------+---------------+----------+--------------+-------------
param1id | integer | not null | plain | |
param2id | integer | | plain | |
param3id | integer | | plain | |
devid | integer | | plain | |
nameid | integer | | plain | |
dt | timestamp without time zone | default now() | plain | |
value | double precision | | plain | |
svalue | character varying(255) | | extended | |
Triggers:
testdata_partition_insert_trigger BEFORE INSERT ON testdata FOR EACH ROW EXECUTE PROCEDURE create_partition_and_insert()
Child tables: testdata_2017_06_01,
testdata_2017_06_02,
... other child tables, one per day ...
testdata_2017_06_30
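The body of create_partition_and_insert() is not shown above. Purely as an illustration of how such routing triggers are usually written, here is a hypothetical sketch assuming one child table per day named testdata_YYYY_MM_DD (not the actual function from this setup):

-- Hypothetical sketch of a per-day partition-insert trigger function; the
-- real create_partition_and_insert() used in the question may differ.
CREATE OR REPLACE FUNCTION create_partition_and_insert() RETURNS trigger AS $$
DECLARE
    child     text      := 'testdata_' || to_char(NEW.dt, 'YYYY_MM_DD');
    day_start timestamp := date_trunc('day', NEW.dt);
BEGIN
    -- Create the day's child table on first use (indexes would be added here as well).
    EXECUTE format(
        'CREATE TABLE IF NOT EXISTS %I (
             CHECK (dt >= %L AND dt < %L)
         ) INHERITS (testdata)',
        child, day_start, day_start + interval '1 day');

    -- Route the row into the child table; returning NULL keeps it out of the parent.
    EXECUTE format('INSERT INTO %I SELECT ($1).*', child) USING NEW;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;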
Definition of the child table testdata_2017_06_01:
\d+ testdata_2017_06_01;
Table "public.testdata_2017_06_01"
Column | Type | Modifiers | Storage | Stats target | Description
------------+-----------------------------+---------------+----------+--------------+-------------
param1id | integer | not null | plain | |
param2id | integer | | plain | |
param3id | integer | | plain | |
devid | integer | | plain | |
nameid | integer | | plain | |
dt | timestamp without time zone | default now() | plain | |
value | double precision | | plain | |
svalue | character varying(255) | | extended | |
Indexes:
"testdata_2017_06_01_devid_idx" btree (devid)
"testdata_2017_06_01_dt_idx" btree (dt)
"testdata_2017_06_01_nameid_idx" btree (nameid)
Check constraints:
"partition_check" CHECK (dt >= '2017-06-01 00:00:00'::timestamp without time zone AND dt <= '2017-06-01 23:59:59'::timestamp without time zone)
Inherits: testdata
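That partition_check constraint is what lets the planner skip every other child table when the WHERE clause pins down dt. Whether this exclusion happens is governed by the constraint_exclusion setting; a quick way to verify it (a sketch, reusing the query from above; the default value partition already covers inheritance children):

SHOW constraint_exclusion;                -- 'partition' is the default and is sufficient here
SET constraint_exclusion = partition;

-- With the check constraint in place, only testdata (the empty parent) and
-- testdata_2017_06_01 should appear under the Append node of the plan:
EXPLAIN SELECT count(*) FROM testdata WHERE dt = '2017-06-01 08:50:00';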
Update #2:
After turning on track_io_timing as @rvs suggested, I restarted the instance, and the picture changed dramatically. First, freeable memory grew from 6 GB to almost 8 GB (screenshot). Second, the planner started using an Index Only Scan instead of a Bitmap Heap Scan. Third, every query now takes far less time (about 2.4 ms for the first run and about 0.041 ms once it hits the cache, compared with the initial 155 and 18 seconds). Finally, the queries started hitting the cache (in the examples above, the EXPLAINs showed only read buffers).
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=4.59..4.60 rows=1 width=8) (actual time=0.014..0.014 rows=1 loops=1)
   Output: count(*)
   Buffers: shared hit=4
   ->  Append  (cost=0.00..4.58 rows=2 width=0) (actual time=0.012..0.012 rows=0 loops=1)
         Buffers: shared hit=4
         ->  Seq Scan on public.testdata  (cost=0.00..0.00 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1)
               Filter: (testdata.dt = '2017-06-15 18:10:10'::timestamp without time zone)
         ->  Index Only Scan using testdata_2017_06_15_dt_idx on public.testdata_2017_06_15  (cost=0.56..4.58 rows=1 width=0) (actual time=0.009..0.009 rows=0 loops=1)
               Index Cond: (testdata_2017_06_15.dt = '2017-06-15 18:10:10'::timestamp without time zone)
               Heap Fetches: 0
               Buffers: shared hit=4
 Planning time: 0.888 ms
 Execution time: 0.044 ms
(13 rows)
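Beyond the Buffers lines in EXPLAIN, one way to watch whether a table's blocks are coming from shared_buffers over time is the cumulative counters in pg_statio_user_tables (a sketch, not from the original post; note that a "read" here can still be served from the OS page cache rather than the disk):

-- Cumulative block counters for one child table: heap_blks_hit grows when a
-- block is found in shared_buffers, heap_blks_read when it has to be read in.
SELECT relname,
       heap_blks_read, heap_blks_hit,
       idx_blks_read,  idx_blks_hit
FROM   pg_statio_user_tables
WHERE  relname = 'testdata_2017_06_15';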
Could a large data load have overflowed all the available memory (8 GB)? We inserted a partition of 40 million records and then copied it into another partition. Should we expect the same behaviour in production (we will be inserting roughly 1k records per second using bulk inserts)? It would be quite scary if we had to restart the database instance from time to time.
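If the concern is that a restart or a large bulk load leaves the cache cold, one option worth mentioning (not discussed in the original thread) is the pg_prewarm contrib extension, which loads a relation's blocks back into the buffer cache on demand. A sketch, assuming the extension is available on the RDS instance:

-- Sketch: re-warm the cache for one partition after a restart or a big load.
CREATE EXTENSION IF NOT EXISTS pg_prewarm;

SELECT pg_prewarm('testdata_2017_06_01');         -- heap blocks
SELECT pg_prewarm('testdata_2017_06_01_dt_idx');  -- the dt index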
Thanks for your help!
How much data do you have, and how much RAM?
Your subsequent queries are most likely hitting the cache (PostgreSQL will also use the OS-level cache). Try set track_io_timing = true in your session and look at the I/O timings Postgres reports for you.
The estimates may also be skewed by an incorrect effective_cache_size setting.
To answer your side question: don't chase "pure" timings. Try to replicate the real scenario. Will this query run repeatedly? Will other queries run in parallel? Always run everything several times. Cache sizing is an important part of performance tuning.
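The settings mentioned in these comments can be inspected directly from any session (a sketch; on Amazon RDS they are changed through the DB parameter group rather than in postgresql.conf):

SHOW shared_buffers;        -- size of PostgreSQL's own buffer cache
SHOW effective_cache_size;  -- planner's estimate of shared_buffers plus OS cache
SHOW track_io_timing;       -- must be on for I/O timings in EXPLAIN (ANALYZE, BUFFERS)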