Pau*_*ard 3 postgresql optimization execution-plan aws-aurora postgresql-performance
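After migrating an application and its database from a classic PostgreSQL database to an Amazon Aurora RDS PostgreSQL database (both version 9.6), we found that one specific query runs much slower on Aurora: roughly 10 times slower than on PostgreSQL.
Both databases have the same configuration, in terms of both hardware and pg_conf.
The query itself is fairly simple. It is generated by our backend, which is written in Java and uses jOOQ to build queries: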
with "all_acp_ids"("acp_id") as (
select acp_id from temp_table_de3398bacb6c4e8ca8b37be227eac089
)
select distinct "public"."f1_folio_milestones"."acp_id",
coalesce("public"."sa_milestone_overrides"."team",
"public"."f1_folio_milestones"."team_responsible")
from "public"."f1_folio_milestones"
left outer join
"public"."sa_milestone_overrides" on (
"public"."f1_folio_milestones"."milestone" = "public"."sa_milestone_overrides"."milestone"
and "public"."f1_folio_milestones"."view" = "public"."sa_milestone_overrides"."view"
and "public"."f1_folio_milestones"."acp_id" = "public"."sa_milestone_overrides"."acp_id"
)
where "public"."f1_folio_milestones"."acp_id" in (
select "all_acp_ids"."acp_id" from "all_acp_ids"
)
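temp_table_de3398bacb6c4e8ca8b37be227eac089 is a single-column table. f1_folio_milestones (17 million entries) and sa_milestone_overrides (around 1 million entries) are similarly designed tables, with indexes on all the columns used in the LEFT OUTER JOIN.
temp_table_de3398bacb6c4e8ca8b37be227eac089 can contain up to 5000 entries, all of them distinct.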
When we run it on the plain PostgreSQL database, it produces the following query plan:
Unique (cost=4802622.20..4868822.51 rows=8826708 width=43) (actual time=483.928..483.930 rows=1 loops=1)
CTE all_acp_ids
-> Seq Scan on temp_table_de3398bacb6c4e8ca8b37be227eac089 (cost=0.00..23.60 rows=1360 width=32) (actual time=0.004..0.005 rows=1 loops=1)
-> Sort (cost=4802598.60..4824665.37 rows=8826708 width=43) (actual time=483.927..483.927 rows=4 loops=1)
Sort Key: f1_folio_milestones.acp_id, (COALESCE(sa_milestone_overrides.team, f1_folio_milestones.team_responsible))
Sort Method: quicksort Memory: 25kB
-> Hash Left Join (cost=46051.06..3590338.34 rows=8826708 width=43) (actual time=483.905..483.917 rows=4 loops=1)
Hash Cond: ((f1_folio_milestones.milestone = sa_milestone_overrides.milestone) AND (f1_folio_milestones.view = (sa_milestone_overrides.view)::text) AND (f1_folio_milestones.acp_id = (sa_milestone_overrides.acp_id)::text))
-> Nested Loop (cost=31.16..2572.60 rows=8826708 width=37) (actual time=0.029..0.038 rows=4 loops=1)
-> HashAggregate (cost=30.60..32.60 rows=200 width=32) (actual time=0.009..0.010 rows=1 loops=1)
Group Key: all_acp_ids.acp_id
-> CTE Scan on all_acp_ids (cost=0.00..27.20 rows=1360 width=32) (actual time=0.006..0.007 rows=1 loops=1)
-> Index Scan using f1_folio_milestones_acp_id_idx on f1_folio_milestones (cost=0.56..12.65 rows=5 width=37) (actual time=0.018..0.025 rows=4 loops=1)
Index Cond: (acp_id = all_acp_ids.acp_id)
-> Hash (cost=28726.78..28726.78 rows=988178 width=34) (actual time=480.423..480.423 rows=987355 loops=1)
Buckets: 1048576 Batches: 1 Memory Usage: 72580kB
-> Seq Scan on sa_milestone_overrides (cost=0.00..28726.78 rows=988178 width=34) (actual time=0.004..189.641 rows=987355 loops=1)
Planning time: 3.561 ms
Execution time: 489.223 ms
As you can see, it runs quite smoothly: the query takes under a second. On the Aurora instance, however, this happens:
Unique (cost=2632927.29..2699194.83 rows=8835672 width=43) (actual time=4577.348..4577.350 rows=1 loops=1)
CTE all_acp_ids
-> Seq Scan on temp_table_de3398bacb6c4e8ca8b37be227eac089 (cost=0.00..23.60 rows=1360 width=32) (actual time=0.001..0.001 rows=1 loops=1)
-> Sort (cost=2632903.69..2654992.87 rows=8835672 width=43) (actual time=4577.348..4577.348 rows=4 loops=1)
Sort Key: f1_folio_milestones.acp_id, (COALESCE(sa_milestone_overrides.team, f1_folio_milestones.team_responsible))
Sort Method: quicksort Memory: 25kB
-> Merge Left Join (cost=1321097.58..1419347.08 rows=8835672 width=43) (actual time=4488.369..4577.330 rows=4 loops=1)
Merge Cond: ((f1_folio_milestones.view = (sa_milestone_overrides.view)::text) AND (f1_folio_milestones.milestone = sa_milestone_overrides.milestone) AND (f1_folio_milestones.acp_id = (sa_milestone_overrides.acp_id)::text))
-> Sort (cost=1194151.06..1216240.24 rows=8835672 width=37) (actual time=0.039..0.040 rows=4 loops=1)
Sort Key: f1_folio_milestones.view, f1_folio_milestones.milestone, f1_folio_milestones.acp_id
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=31.16..2166.95 rows=8835672 width=37) (actual time=0.022..0.028 rows=4 loops=1)
-> HashAggregate (cost=30.60..32.60 rows=200 width=32) (actual time=0.006..0.006 rows=1 loops=1)
Group Key: all_acp_ids.acp_id
-> CTE Scan on all_acp_ids (cost=0.00..27.20 rows=1360 width=32) (actual time=0.003..0.004 rows=1 loops=1)
-> Index Scan using f1_folio_milestones_acp_id_idx on f1_folio_milestones (cost=0.56..10.63 rows=4 width=37) (actual time=0.011..0.015 rows=4 loops=1)
Index Cond: (acp_id = all_acp_ids.acp_id)
-> Sort (cost=126946.52..129413.75 rows=986892 width=34) (actual time=4462.727..4526.822 rows=448136 loops=1)
Sort Key: sa_milestone_overrides.view, sa_milestone_overrides.milestone, sa_milestone_overrides.acp_id
Sort Method: quicksort Memory: 106092kB
-> Seq Scan on sa_milestone_overrides (cost=0.00..28688.92 rows=986892 width=34) (actual time=0.003..164.348 rows=986867 loops=1)
Planning time: 1.394 ms
Execution time: 4583.295 ms
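It actually has a lower overall cost, yet it takes almost 10 times as long!
Disabling merge joins makes Aurora fall back to the hash join, which gives the expected execution time, but disabling them permanently is not an option. Curiously, disabling nested loops gives an even better result while still using a merge join...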
Unique (cost=3610230.74..3676431.05 rows=8826708 width=43) (actual time=2.465..2.466 rows=1 loops=1)
CTE all_acp_ids
-> Seq Scan on temp_table_de3398bacb6c4e8ca8b37be227eac089 (cost=0.00..23.60 rows=1360 width=32) (actual time=0.004..0.004 rows=1 loops=1)
-> Sort (cost=3610207.14..3632273.91 rows=8826708 width=43) (actual time=2.464..2.464 rows=4 loops=1)
Sort Key: f1_folio_milestones.acp_id, (COALESCE(sa_milestone_overrides.team, f1_folio_milestones.team_responsible))
Sort Method: quicksort Memory: 25kB
-> Merge Left Join (cost=59.48..2397946.87 rows=8826708 width=43) (actual time=2.450..2.455 rows=4 loops=1)
Merge Cond: (f1_folio_milestones.acp_id = (sa_milestone_overrides.acp_id)::text)
Join Filter: ((f1_folio_milestones.milestone = sa_milestone_overrides.milestone) AND (f1_folio_milestones.view = (sa_milestone_overrides.view)::text))
-> Merge Join (cost=40.81..2267461.88 rows=8826708 width=37) (actual time=2.312..2.317 rows=4 loops=1)
Merge Cond: (f1_folio_milestones.acp_id = all_acp_ids.acp_id)
-> Index Scan using f1_folio_milestones_acp_id_idx on f1_folio_milestones (cost=0.56..2223273.29 rows=17653416 width=37) (actual time=0.020..2.020 rows=1952 loops=1)
-> Sort (cost=40.24..40.74 rows=200 width=32) (actual time=0.011..0.012 rows=1 loops=1)
Sort Key: all_acp_ids.acp_id
Sort Method: quicksort Memory: 25kB
-> HashAggregate (cost=30.60..32.60 rows=200 width=32) (actual time=0.008..0.008 rows=1 loops=1)
Group Key: all_acp_ids.acp_id
-> CTE Scan on all_acp_ids (cost=0.00..27.20 rows=1360 width=32) (actual time=0.005..0.005 rows=1 loops=1)
-> Materialize (cost=0.42..62167.38 rows=987968 width=34) (actual time=0.021..0.101 rows=199 loops=1)
-> Index Scan using sa_milestone_overrides_acp_id_index on sa_milestone_overrides (cost=0.42..59697.46 rows=987968 width=34) (actual time=0.019..0.078 rows=199 loops=1)
Planning time: 5.500 ms
Execution time: 2.516 ms
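We have asked the AWS support team, who are still investigating, but we would like to know what causes this. What could explain the difference in behavior?
While going through some of the database documentation, I read that Aurora favors cost over time, so it uses the query plan with the lowest cost.
But as we can see, that plan is far from optimal given its response time... Is there a threshold or setting that would make the database use a more expensive but faster query plan?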
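One issue stands out (in all of the query plans) and is easy to fix:
Seq Scan on temp_table_de3398bacb6c4e8ca8b37be227eac089 (cost=0.00..23.60 rows=1360 width=32) (actual time=0.004..0.005 rows=1 loops=1)
Note the estimated rows=1360 versus the actual rows=1: Postgres expects 1360 rows in that table but finds only 1.
You commented:
It's an ordinary table that is dropped once everything is done. [...] The query plan was produced with a single value in the table, but it can hold up to 5000 entries in total, all distinct.
.. which would explain the completely misleading row estimate. Statistics are kept up to date by autovacuum, but it takes some time to kick in. If you run a complex query immediately after populating (or heavily modifying) such a table, it is wise to at least run ANALYZE manually in between: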
ANALYZE temp_table_de3398bacb6c4e8ca8b37be227eac089;
Maybe even VACUUM ANALYZE, but that cannot be run inside a transaction.
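An actual temporary table (CREATE TEMP TABLE ...) seems like a better fit for your use case, as long as other sessions don't need to see the same state of the table. Performance is better overall. Note, however, that temporary tables are not analyzed by autovacuum at all, so a manual ANALYZE matters even more there.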
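A minimal sketch of that temporary-table workflow follows. The table name tmp_acp_ids, the text column type (inferred from the casts in the plans above) and the sample values are assumptions for illustration, not taken from the actual schema:

```sql
BEGIN;

-- Assumption: acp_id is text; adjust to the real column type.
-- ON COMMIT DROP removes the table automatically at the end of the transaction.
CREATE TEMP TABLE tmp_acp_ids (acp_id text) ON COMMIT DROP;

-- Hypothetical sample values; the application would insert up to ~5000 distinct ids.
INSERT INTO tmp_acp_ids (acp_id) VALUES ('id-1'), ('id-2');

-- autovacuum never processes temp tables, so gather statistics manually.
-- Unlike VACUUM, plain ANALYZE is allowed inside a transaction.
ANALYZE tmp_acp_ids;

-- Run the main query here, reading from tmp_acp_ids instead of the regular table.
SELECT count(*) FROM tmp_acp_ids;

COMMIT;
```

Because the table is private to the session, concurrent requests cannot collide on its name, which also removes the need for a generated table name per request.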
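Failing that, Postgres tries all kinds of unsuitable query plans, based on completely misleading row estimates for a core table. There are many possible reasons for Postgres to pick a different query plan: anything that changes the cost estimates. But fix this issue, and the problem at hand will most likely go away.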
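"Cost" in a Postgres query plan is the estimated time (in arbitrary units).
"Actual time" is the time measured after the fact.
Postgres always favors the lower cost. That is how it decides which plan to pick, and it has nothing to do with Aurora. Your main problem is misleading statistics --> misleading row estimates --> misleading cost estimates. Details are in the manual.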
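To check whether stale statistics are in play, the standard statistics views can be queried before running the main query. This is only a sketch; the table name is the one from the question, and reltuples is itself an estimate rather than an exact count:

```sql
-- When were statistics last gathered, and how many rows changed since then?
SELECT relname, last_analyze, last_autoanalyze, n_live_tup, n_mod_since_analyze
FROM   pg_stat_user_tables
WHERE  relname = 'temp_table_de3398bacb6c4e8ca8b37be227eac089';

-- The row and page counts the planner has stored for the table.
SELECT reltuples, relpages
FROM   pg_class
WHERE  relname = 'temp_table_de3398bacb6c4e8ca8b37be227eac089';
```

If both last_analyze and last_autoanalyze are NULL, the table has never been analyzed and the planner is working from default assumptions.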
That said, I would simplify the query. The CTE adds nothing useful to the query, whether it is materialized or not. Remove it.
SELECT DISTINCT m.acp_id,
COALESCE(o.team, m.team_responsible)
FROM temp_table_de3398bacb6c4e8ca8b37be227eac089 t
JOIN public.f1_folio_milestones m USING (acp_id)
LEFT JOIN public.sa_milestone_overrides o USING (milestone, view, acp_id);
USING is just a convenience to save typing. It has no performance impact. Use it only where applicable.
Given your cardinalities (at most 5000 rows in t, but 17 million rows in m), I would try this alternative query with a LATERAL subquery:
SELECT t.acp_id, om.team
FROM temp_table_de3398bacb6c4e8ca8b37be227eac089 t
CROSS JOIN LATERAL (
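-- DISTINCT per acp_id matches the outer SELECT DISTINCT of the original query,
-- given that acp_id values in t are all distinct.
SELECT DISTINCT COALESCE(o.team, m.team_responsible) AS team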
FROM public.f1_folio_milestones m
LEFT JOIN public.sa_milestone_overrides o USING (milestone, view, acp_id)
WHERE m.acp_id = t.acp_id
) om;
With favorable data distribution and fitting indexes, it may be much faster.
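What a "fitting" index looks like depends on the actual schema. As one assumption-laden sketch: the plans show an existing index on f1_folio_milestones (acp_id), so the LATERAL variant would mainly benefit from a multicolumn index on the join columns of sa_milestone_overrides. The index name below is hypothetical, and an equivalent index may already exist:

```sql
-- Hypothetical index name; inspect the existing indexes first
-- (e.g. \d sa_milestone_overrides in psql) before adding another one.
-- CONCURRENTLY avoids blocking writes, but cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY IF NOT EXISTS sa_milestone_overrides_acp_milestone_view_idx
    ON public.sa_milestone_overrides (acp_id, milestone, view);
```

Leading with acp_id lets each per-id lookup inside the LATERAL subquery use the index; whether the planner actually does so is best confirmed with EXPLAIN (ANALYZE, BUFFERS).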