Mar*_*hie · 6 · tags: postgresql, performance, postgresql-performance
We want to move our database from a virtual-hosting provider (Postgres 9.0) to a server on our local network (we have tried Postgres 10 and the latest 11). The machine is a fast Xeon Windows server with 16 GB of RAM, dedicated to the database.
However, even after raising default_statistics_target = 4000 and analyzing, many views that used to run very fast have become unusable. The hosting provider's server seems to be fine-tuned, while our execution plans come out strange for some reason.
Our Postgres is a stock installation with the default configuration.
A simplified example query follows. The biggest table is the "dale" table, with a few million rows (it is a link table holding foreign keys); the other tables are much smaller, around ten thousand rows each. The system has been VACUUM ANALYZE-d; it is a fresh installation.
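For reference, raising the target and refreshing statistics can be done like this (a sketch; the value 4000 is the one tried above, and superuser rights are assumed for `ALTER SYSTEM`):

```sql
-- Raise the per-column statistics target cluster-wide (requires superuser),
-- then reload the configuration and recollect statistics.
ALTER SYSTEM SET default_statistics_target = 4000;
SELECT pg_reload_conf();
ANALYZE;  -- re-samples every table in the current database
```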
EXPLAIN ANALYZE
SELECT REKLAMACNY_LIST.ID REKLAMACNY_LIST_ID
FROM REKLAMACNY_LIST
WHERE REKLAMACNY_LIST.VALIDTO IS NULL
AND
( SELECT NOT tr.typ_odstupenia::boolean
AND sr.konecny_stav::boolean
FROM dale d1
CROSS JOIN typ_reklamacie tr
CROSS JOIN dale d2
CROSS JOIN stav_reklamacie sr
WHERE TRUE
AND d1.fk7 = reklamacny_list.id
AND d2.fk7 = reklamacny_list.id
AND d1.fk1 = tr.id
AND d2.fk3 = sr.id
AND sr.validto IS NULL
AND tr.validto IS NULL
AND d1.validto IS NULL
AND d2.validto IS NULL )
ORDER BY reklamacny_list_id DESC
LIMIT 100
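One thing that might be worth testing (a hypothetical rewrite, not part of the original question): expressing the correlated scalar subquery as `EXISTS` with explicit joins sometimes lets the planner choose a semi-join instead of executing the subplan once per outer row. The semantics are not identical — `EXISTS` asks whether any qualifying row combination satisfies the condition, whereas the scalar subquery would raise an error if it returned more than one row:

```sql
SELECT rl.id AS reklamacny_list_id
FROM reklamacny_list rl
WHERE rl.validto IS NULL
  AND EXISTS (
        SELECT 1
        FROM dale d1
        JOIN typ_reklamacie tr  ON tr.id = d1.fk1
        JOIN dale d2            ON d2.fk7 = rl.id
        JOIN stav_reklamacie sr ON sr.id = d2.fk3
        WHERE d1.fk7 = rl.id
          AND tr.validto IS NULL AND sr.validto IS NULL
          AND d1.validto IS NULL AND d2.validto IS NULL
          AND NOT tr.typ_odstupenia::boolean
          AND sr.konecny_stav::boolean)
ORDER BY rl.id DESC
LIMIT 100;
```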
EXPLAIN ANALYZE on our test Postgres 11 server:
"Limit (cost=0.29..18432.84 rows=100 width=4) (actual time=11804.484..331036.595 rows=91 loops=1)"
" -> Index Scan Backward using reklamacny_list_pk on reklamacny_list (cost=0.29..2578713.84 rows=13990 width=4) (actual time=11804.482..331036.524 rows=91 loops=1)"
" Index Cond: (id > 0)"
" Filter: ((validto IS NULL) AND (SubPlan 1))"
" Rows Removed by Filter: 29199"
" SubPlan 1"
" -> Hash Join (cost=5.30..87.57 rows=250 width=1) (actual time=5.246..11.824 rows=1 loops=27981)"
" Hash Cond: (d2.fk3 = sr.id)"
" -> Merge Join (cost=1.85..80.76 rows=324 width=9) (actual time=5.222..11.806 rows=6 loops=27981)"
" Merge Cond: (d1.fk1 = tr.id)"
" -> Nested Loop (cost=0.71..25556.34 rows=324 width=8) (actual time=5.211..11.794 rows=6 loops=27981)"
" -> Index Scan using dale_idx_fk1 on dale d1 (cost=0.29..25510.95 rows=18 width=4) (actual time=5.195..11.772 rows=1 loops=27981)"
" Filter: (fk7 = reklamacny_list.id)"
" Rows Removed by Filter: 28432"
" -> Materialize (cost=0.42..41.38 rows=18 width=4) (actual time=0.011..0.015 rows=6 loops=27890)"
" -> Index Scan using dale_fk7_idx on dale d2 (cost=0.42..41.29 rows=18 width=4) (actual time=0.007..0.010 rows=6 loops=27890)"
" Index Cond: (fk7 = reklamacny_list.id)"
" -> Sort (cost=1.14..1.15 rows=6 width=9) (actual time=0.009..0.009 rows=1 loops=27890)"
" Sort Key: tr.id"
" Sort Method: quicksort Memory: 25kB"
" -> Seq Scan on typ_reklamacie tr (cost=0.00..1.06 rows=6 width=9) (actual time=0.002..0.004 rows=6 loops=27890)"
" Filter: (validto IS NULL)"
" -> Hash (cost=2.74..2.74 rows=57 width=9) (actual time=0.046..0.047 rows=57 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 11kB"
" -> Seq Scan on stav_reklamacie sr (cost=0.00..2.74 rows=57 width=9) (actual time=0.012..0.030 rows=57 loops=1)"
" Filter: (validto IS NULL)"
" Rows Removed by Filter: 17"
"Planning Time: 10.174 ms"
"Execution Time: 331036.893 ms"
On our test Postgres 10.0 server, where execution takes forever instead of returning results immediately:
"Limit (cost=0.29..163635.75 rows=100 width=4) (actual time=24.199..925.691 rows=70 loops=1)"
" -> Index Scan Backward using reklamacny_list_pk on reklamacny_list (cost=0.29..21326610.37 rows=13033 width=4) (actual time=24.195..925.660 rows=70 loops=1)"
" Index Cond: (id > 0)"
" Filter: ((validto IS NULL) AND (SubPlan 1))"
" Rows Removed by Filter: 27218"
" SubPlan 1"
" -> Nested Loop (cost=4.22..781.03 rows=1293 width=1) (actual time=0.018..0.034 rows=1 loops=26066)"
" -> Hash Join (cost=3.80..377.12 rows=76 width=5) (actual time=0.005..0.005 rows=1 loops=26066)"
" Hash Cond: (d2.fk3 = sr.id)"
" -> Index Scan using dale_fk7_idx on dale d2 (cost=0.42..372.47 rows=100 width=4) (actual time=0.002..0.004 rows=5 loops=26066)"
" Index Cond: (fk7 = reklamacny_list.id)"
" -> Hash (cost=2.71..2.71 rows=54 width=9) (actual time=0.049..0.049 rows=54 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 11kB"
" -> Seq Scan on stav_reklamacie sr (cost=0.00..2.71 rows=54 width=9) (actual time=0.016..0.032 rows=54 loops=1)"
" Filter: (validto IS NULL)"
" Rows Removed by Filter: 17"
" -> Materialize (cost=0.42..374.87 rows=17 width=24) (actual time=0.013..0.027 rows=1 loops=25987)"
" -> Nested Loop (cost=0.42..374.78 rows=17 width=24) (actual time=0.010..0.024 rows=1 loops=25987)"
" Join Filter: (d1.fk1 = tr.id)"
" Rows Removed by Join Filter: 32"
" -> Seq Scan on typ_reklamacie tr (cost=0.00..1.06 rows=1 width=28) (actual time=0.001..0.002 rows=6 loops=25987)"
" Filter: (validto IS NULL)"
" -> Index Scan using dale_fk7_idx on dale d1 (cost=0.42..372.47 rows=100 width=4) (actual time=0.002..0.003 rows=5 loops=155922)"
" Index Cond: (fk7 = reklamacny_list.id)"
"Planning time: 8.460 ms"
"Execution time: 925.870 ms"
This is just one example, but almost every query is many times slower on 11, and even things that take forever on 10 are returned instantly by the hosting provider's Postgres 9.0 (which also hosts hundreds of other databases).
Do you see anything worth investigating?
Would tuning some memory parameters help? (The server has 16 GB just for Postgres and the OS, and there will be about 50 connected users.) Raising default_statistics_target = 10000 actually helped a lot, but even so the plans remain poor.
A different version of the same request using COALESCE, otherwise identical:
EXPLAIN ANALYZE
SELECT REKLAMACNY_LIST.ID REKLAMACNY_LIST_ID
FROM REKLAMACNY_LIST
WHERE REKLAMACNY_LIST.VALIDTO IS NULL
AND REKLAMACNY_LIST.ID > 0
AND ((
( SELECT (NOT COALESCE(tr.typ_odstupenia, 'False')::boolean)
AND COALESCE(sr.konecny_stav, 'False')::boolean
FROM dale d1
CROSS JOIN typ_reklamacie tr
CROSS JOIN dale d2
CROSS JOIN stav_reklamacie sr
WHERE TRUE
AND d1.fk7 = reklamacny_list.id
AND d2.fk7 = reklamacny_list.id
AND d1.fk1 = tr.id
AND d2.fk3 = sr.id
AND sr.validto IS NULL
AND tr.validto IS NULL
AND d1.validto IS NULL
AND d2.validto IS NULL )))
ORDER BY reklamacny_list_id DESC
LIMIT 100
On Postgres 11 it jumps to about 10 seconds (a big difference from the previous version of the query without COALESCE):
"Limit (cost=0.29..18432.84 rows=100 width=4) (actual time=447.853..10695.583 rows=100 loops=1)"
" -> Index Scan Backward using reklamacny_list_pk on reklamacny_list (cost=0.29..2578713.84 rows=13990 width=4) (actual time=447.851..10695.495 rows=100 loops=1)"
" Index Cond: (id > 0)"
" Filter: ((validto IS NULL) AND (SubPlan 1))"
" Rows Removed by Filter: 687"
" SubPlan 1"
" -> Hash Join (cost=5.30..87.57 rows=250 width=1) (actual time=11.436..14.102 rows=1 loops=758)"
" Hash Cond: (d2.fk3 = sr.id)"
" -> Merge Join (cost=1.85..80.76 rows=324 width=9) (actual time=11.407..14.076 rows=5 loops=758)"
" Merge Cond: (d1.fk1 = tr.id)"
" -> Nested Loop (cost=0.71..25556.34 rows=324 width=8) (actual time=11.389..14.056 rows=5 loops=758)"
" -> Index Scan using dale_idx_fk1 on dale d1 (cost=0.29..25510.95 rows=18 width=4) (actual time=11.361..14.023 rows=1 loops=758)"
" Filter: (fk7 = reklamacny_list.id)"
" Rows Removed by Filter: 28432"
" -> Materialize (cost=0.42..41.38 rows=18 width=4) (actual time=0.017..0.021 rows=5 loops=754)"
" -> Index Scan using dale_fk7_idx on dale d2 (cost=0.42..41.29 rows=18 width=4) (actual time=0.009..0.012 rows=5 loops=754)"
" Index Cond: (fk7 = reklamacny_list.id)"
" -> Sort (cost=1.14..1.15 rows=6 width=9) (actual time=0.015..0.015 rows=2 loops=754)"
" Sort Key: tr.id"
" Sort Method: quicksort Memory: 25kB"
" -> Seq Scan on typ_reklamacie tr (cost=0.00..1.06 rows=6 width=9) (actual time=0.003..0.006 rows=6 loops=754)"
" Filter: (validto IS NULL)"
" -> Hash (cost=2.74..2.74 rows=57 width=9) (actual time=0.092..0.092 rows=57 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 11kB"
" -> Seq Scan on stav_reklamacie sr (cost=0.00..2.74 rows=57 width=9) (actual time=0.032..0.068 rows=57 loops=1)"
" Filter: (validto IS NULL)"
" Rows Removed by Filter: 17"
"Planning Time: 1.556 ms"
"Execution Time: 10695.752 ms"
On Postgres 10:
"Limit (cost=0.29..163635.75 rows=100 width=4) (actual time=1.958..20.024 rows=100 loops=1)"
" -> Index Scan Backward using reklamacny_list_pk on reklamacny_list (cost=0.29..21326610.37 rows=13033 width=4) (actual time=1.957..20.011 rows=100 loops=1)"
" Index Cond: (id > 0)"
" Filter: ((validto IS NULL) AND (SubPlan 1))"
" Rows Removed by Filter: 572"
" SubPlan 1"
" -> Nested Loop (cost=4.22..781.03 rows=1293 width=1) (actual time=0.017..0.031 rows=1 loops=609)"
" -> Hash Join (cost=3.80..377.12 rows=76 width=5) (actual time=0.004..0.004 rows=1 loops=609)"
" Hash Cond: (d2.fk3 = sr.id)"
" -> Index Scan using dale_fk7_idx on dale d2 (cost=0.42..372.47 rows=100 width=4) (actual time=0.002..0.003 rows=5 loops=609)"
" Index Cond: (fk7 = reklamacny_list.id)"
" -> Hash (cost=2.71..2.71 rows=54 width=9) (actual time=0.037..0.037 rows=54 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 11kB"
" -> Seq Scan on stav_reklamacie sr (cost=0.00..2.71 rows=54 width=9) (actual time=0.009..0.023 rows=54 loops=1)"
" Filter: (validto IS NULL)"
" Rows Removed by Filter: 17"
" -> Materialize (cost=0.42..374.87 rows=17 width=24) (actual time=0.013..0.025 rows=1 loops=604)"
" -> Nested Loop (cost=0.42..374.78 rows=17 width=24) (actual time=0.010..0.022 rows=1 loops=604)"
" Join Filter: (d1.fk1 = tr.id)"
" Rows Removed by Join Filter: 31"
" -> Seq Scan on typ_reklamacie tr (cost=0.00..1.06 rows=1 width=28) (actual time=0.001..0.002 rows=6 loops=604)"
" Filter: (validto IS NULL)"
" -> Index Scan using dale_fk7_idx on dale d1 (cost=0.42..372.47 rows=100 width=4) (actual time=0.002..0.003 rows=5 loops=3624)"
" Index Cond: (fk7 = reklamacny_list.id)"
"Planning time: 1.418 ms"
"Execution time: 20.193 ms"
I am attaching the full logs (including the postgresql.conf configuration) in a zip file. Raising the default statistics target seems to help, but only at very high values.
https://www.dropbox.com/s/7m3wy9nkapqitca/SpeedTest.zip?dl=0
小智 · 5
The default postgresql.conf configuration is only suitable for databases with a small footprint; if your database is large and your queries use complex joins, it will be slow.
I can see from your PostgreSQL 10 configuration file that shared_buffers is set to only 128MB (and many other settings are also very small). You need to reconfigure it.
Tuning a PostgreSQL server is a big topic, and different kinds of hardware need different settings; it also involves trial and error, continually refining the settings and configuration.
I cannot cover the whole topic here; I can only share the tuning settings I have used:
max_connections : 800
shared_buffers : 1536MB
work_mem : 24MB
maintenance_work_mem : 480MB
vacuum_cost_delay : 20ms
synchronous_commit : local
wal_buffers : 8MB
max_wal_size : 1536GB
checkpoint_completion_target : 0.9
effective_cache_size : 4GB
deadlock_timeout : 3s
log_min_duration_statement : 5000
log_error_verbosity : verbose
log_autovacuum_min_duration : 10000
log_lock_waits : on
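The settings above can be applied without hand-editing postgresql.conf, for example (a sketch using a subset of the values listed; `ALTER SYSTEM` writes to postgresql.auto.conf, and shared_buffers still requires a server restart):

```sql
ALTER SYSTEM SET shared_buffers = '1536MB';            -- needs restart
ALTER SYSTEM SET work_mem = '24MB';
ALTER SYSTEM SET maintenance_work_mem = '480MB';
ALTER SYSTEM SET effective_cache_size = '4GB';
ALTER SYSTEM SET checkpoint_completion_target = 0.9;
-- Most of these take effect on reload:
SELECT pg_reload_conf();
```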
psql 10: Buffers: shared hit=4,439,193
psql 09: Buffers: shared hit=7,493 read=686
https://www.postgresql.org/docs/11/sql-explain.html
Include information on buffer usage. Specifically, include the number of shared blocks hit, read, dirtied, and written, the number of local blocks hit, read, dirtied, and written, and the number of temp blocks read and written. A hit means that a read was avoided because the block was found already in cache when needed. Shared blocks contain data from regular tables and indexes; local blocks contain data from temporary tables and indexes; while temp blocks contain short-term working data used in sorts, hashes, Materialize plan nodes, and similar cases. The number of blocks dirtied indicates the number of previously unmodified blocks changed by this query; while the number of blocks written indicates the number of previously-dirtied blocks evicted from cache by this backend during query processing. The number of blocks shown for an upper-level node includes those used by all its child nodes. In text format, only non-zero values are printed. This parameter may only be used when ANALYZE is also enabled. It defaults to FALSE.
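To collect these counters for the problem query, add the BUFFERS option to EXPLAIN (a sketch; the subquery body is elided here and should be the same as in the question):

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT reklamacny_list.id AS reklamacny_list_id
FROM reklamacny_list
WHERE reklamacny_list.validto IS NULL
  AND ( /* ... same correlated subquery as in the question ... */ )
ORDER BY reklamacny_list_id DESC
LIMIT 100;
```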
Something is off with the db: on p09 roughly 8,000 buffers are involved, while on p10 there are 4M buffer hits.
If this is the same dataset, the data must contain a lot of empty/deleted rows.
If that is the case, then a vacuum should help.
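If dead-row bloat is the cause, something along these lines on the big table would confirm and fix it (a hypothetical check; table name taken from the plans above):

```sql
-- Check the live/dead tuple ratio first:
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'dale';

-- Reclaim dead space and refresh statistics in one pass:
VACUUM (VERBOSE, ANALYZE) dale;
```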
The execution plans are very different, and so are the row-count estimates involved; updated statistics on specific columns may help:
https://www.citusdata.com/blog/2018/03/06/postgres-planner-and-its-usage-of-statistics/
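Per-column targets can be raised without paying the planning cost of a high default_statistics_target everywhere (a sketch; the columns chosen are the join keys visible in the plans above):

```sql
-- Raise the sample size only for the hot join columns, then re-analyze:
ALTER TABLE dale ALTER COLUMN fk7 SET STATISTICS 10000;
ALTER TABLE dale ALTER COLUMN fk1 SET STATISTICS 10000;
ALTER TABLE dale ALTER COLUMN fk3 SET STATISTICS 10000;
ANALYZE dale;
```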
The psql 09 plan has no separate sort step. Maybe the indexes are not the same; psql 09 may already be fetching the data in the right order...
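If the index definitions really do differ between the servers, a partial index matching the filter and join columns is one thing to try (a hypothetical index, not taken from the question's schema dump):

```sql
-- Serves the d1/d2 lookups (fk7 = id AND validto IS NULL) from one small index;
-- CONCURRENTLY avoids blocking writes while it builds.
CREATE INDEX CONCURRENTLY dale_fk7_live_idx
    ON dale (fk7) WHERE validto IS NULL;
```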