Iai*_*der 21 postgresql optimization postgresql-9.3
我写了两个函数来回答七周内七个数据库中第 3 天的第一个作业问题。
创建一个存储过程,您可以在其中输入您喜欢的电影名称或演员的名字,它会根据演员主演的电影或类似类型的电影返回前五名的建议。
我的第一次尝试是正确的,但速度很慢。返回结果最多可能需要 2000 毫秒。
CREATE OR REPLACE FUNCTION suggest_movies(IN query text, IN result_limit integer DEFAULT 5)
RETURNS TABLE(movie_id integer, title text) AS
$BODY$
WITH suggestions AS (
SELECT
actors.name AS entity_term,
movies.movie_id AS suggestion_id,
movies.title AS suggestion_title,
1 AS rank
FROM actors
INNER JOIN movies_actors ON (actors.actor_id = movies_actors.actor_id)
INNER JOIN movies ON (movies.movie_id = movies_actors.movie_id)
UNION ALL
SELECT
searches.title AS entity_term,
suggestions.movie_id AS suggestion_id,
suggestions.title AS suggestion_title,
RANK() OVER (PARTITION BY searches.movie_id ORDER BY cube_distance(searches.genre, suggestions.genre)) AS rank
FROM movies AS searches
INNER JOIN movies AS suggestions ON
(searches.movie_id <> suggestions.movie_id) AND
(cube_enlarge(searches.genre, 2, 18) @> suggestions.genre)
)
SELECT suggestion_id, suggestion_title
FROM suggestions
WHERE entity_term = query
ORDER BY rank, suggestion_id
LIMIT result_limit;
$BODY$
LANGUAGE sql;
Run Code Online (Sandbox Code Playgroud)
我的第二次尝试是正确且快速的。我通过将过滤器从 CTE 向下推到联合的每个部分来优化它。
我从外部查询中删除了这一行:
WHERE entity_term = query
Run Code Online (Sandbox Code Playgroud)
我将此行添加到第一个内部查询中:
WHERE actors.name = query
Run Code Online (Sandbox Code Playgroud)
我将此行添加到第二个内部查询中:
WHERE movies.title = query
Run Code Online (Sandbox Code Playgroud)
第二个函数大约需要 10 毫秒才能返回相同的结果。
除了函数定义之外,数据库中没有任何不同。
为什么 PostgreSQL 会为这两个逻辑上等价的查询生成如此不同的计划?
第EXPLAIN ANALYZE一个函数的计划如下所示:
Limit (cost=7774.18..7774.19 rows=5 width=44) (actual time=1738.566..1738.567 rows=5 loops=1)
CTE suggestions
-> Append (cost=332.56..7337.19 rows=19350 width=285) (actual time=7.113..1577.823 rows=383024 loops=1)
-> Subquery Scan on "*SELECT* 1" (cost=332.56..996.80 rows=11168 width=33) (actual time=7.113..22.258 rows=11168 loops=1)
-> Hash Join (cost=332.56..885.12 rows=11168 width=33) (actual time=7.110..19.850 rows=11168 loops=1)
Hash Cond: (movies_actors.movie_id = movies.movie_id)
-> Hash Join (cost=143.19..514.27 rows=11168 width=18) (actual time=4.326..11.938 rows=11168 loops=1)
Hash Cond: (movies_actors.actor_id = actors.actor_id)
-> Seq Scan on movies_actors (cost=0.00..161.68 rows=11168 width=8) (actual time=0.013..1.648 rows=11168 loops=1)
-> Hash (cost=80.86..80.86 rows=4986 width=18) (actual time=4.296..4.296 rows=4986 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 252kB
-> Seq Scan on actors (cost=0.00..80.86 rows=4986 width=18) (actual time=0.009..1.681 rows=4986 loops=1)
-> Hash (cost=153.61..153.61 rows=2861 width=19) (actual time=2.768..2.768 rows=2861 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 146kB
-> Seq Scan on movies (cost=0.00..153.61 rows=2861 width=19) (actual time=0.003..1.197 rows=2861 loops=1)
-> Subquery Scan on "*SELECT* 2" (cost=6074.48..6340.40 rows=8182 width=630) (actual time=1231.324..1528.188 rows=371856 loops=1)
-> WindowAgg (cost=6074.48..6258.58 rows=8182 width=630) (actual time=1231.324..1492.106 rows=371856 loops=1)
-> Sort (cost=6074.48..6094.94 rows=8182 width=630) (actual time=1231.307..1282.550 rows=371856 loops=1)
Sort Key: searches.movie_id, (cube_distance(searches.genre, suggestions_1.genre))
Sort Method: external sort Disk: 21584kB
-> Nested Loop (cost=0.27..3246.72 rows=8182 width=630) (actual time=0.047..909.096 rows=371856 loops=1)
-> Seq Scan on movies searches (cost=0.00..153.61 rows=2861 width=315) (actual time=0.003..0.676 rows=2861 loops=1)
-> Index Scan using movies_genres_cube on movies suggestions_1 (cost=0.27..1.05 rows=3 width=315) (actual time=0.016..0.277 rows=130 loops=2861)
Index Cond: (cube_enlarge(searches.genre, 2::double precision, 18) @> genre)
Filter: (searches.movie_id <> movie_id)
Rows Removed by Filter: 1
-> Sort (cost=436.99..437.23 rows=97 width=44) (actual time=1738.565..1738.566 rows=5 loops=1)
Sort Key: suggestions.rank, suggestions.suggestion_id
Sort Method: top-N heapsort Memory: 25kB
-> CTE Scan on suggestions (cost=0.00..435.38 rows=97 width=44) (actual time=1281.905..1738.531 rows=43 loops=1)
Filter: (entity_term = 'Die Hard'::text)
Rows Removed by Filter: 382981
Total runtime: 1746.623 ms
Run Code Online (Sandbox Code Playgroud)
EXPLAIN ANALYZE第二个查询的计划如下所示:
Limit (cost=43.74..43.76 rows=5 width=44) (actual time=1.231..1.234 rows=5 loops=1)
CTE suggestions
-> Append (cost=4.86..43.58 rows=5 width=391) (actual time=1.029..1.141 rows=43 loops=1)
-> Subquery Scan on "*SELECT* 1" (cost=4.86..20.18 rows=2 width=33) (actual time=0.047..0.047 rows=0 loops=1)
-> Nested Loop (cost=4.86..20.16 rows=2 width=33) (actual time=0.047..0.047 rows=0 loops=1)
-> Nested Loop (cost=4.58..19.45 rows=2 width=18) (actual time=0.045..0.045 rows=0 loops=1)
-> Index Scan using actors_name on actors (cost=0.28..8.30 rows=1 width=18) (actual time=0.045..0.045 rows=0 loops=1)
Index Cond: (name = 'Die Hard'::text)
-> Bitmap Heap Scan on movies_actors (cost=4.30..11.13 rows=2 width=8) (never executed)
Recheck Cond: (actor_id = actors.actor_id)
-> Bitmap Index Scan on movies_actors_actor_id (cost=0.00..4.30 rows=2 width=0) (never executed)
Index Cond: (actor_id = actors.actor_id)
-> Index Scan using movies_pkey on movies (cost=0.28..0.35 rows=1 width=19) (never executed)
Index Cond: (movie_id = movies_actors.movie_id)
-> Subquery Scan on "*SELECT* 2" (cost=23.31..23.40 rows=3 width=630) (actual time=0.982..1.081 rows=43 loops=1)
-> WindowAgg (cost=23.31..23.37 rows=3 width=630) (actual time=0.982..1.064 rows=43 loops=1)
-> Sort (cost=23.31..23.31 rows=3 width=630) (actual time=0.963..0.971 rows=43 loops=1)
Sort Key: searches.movie_id, (cube_distance(searches.genre, suggestions_1.genre))
Sort Method: quicksort Memory: 28kB
-> Nested Loop (cost=4.58..23.28 rows=3 width=630) (actual time=0.808..0.916 rows=43 loops=1)
-> Index Scan using movies_title on movies searches (cost=0.28..8.30 rows=1 width=315) (actual time=0.025..0.027 rows=1 loops=1)
Index Cond: (title = 'Die Hard'::text)
-> Bitmap Heap Scan on movies suggestions_1 (cost=4.30..14.95 rows=3 width=315) (actual time=0.775..0.844 rows=43 loops=1)
Recheck Cond: (cube_enlarge(searches.genre, 2::double precision, 18) @> genre)
Filter: (searches.movie_id <> movie_id)
Rows Removed by Filter: 1
-> Bitmap Index Scan on movies_genres_cube (cost=0.00..4.29 rows=3 width=0) (actual time=0.750..0.750 rows=44 loops=1)
Index Cond: (cube_enlarge(searches.genre, 2::double precision, 18) @> genre)
-> Sort (cost=0.16..0.17 rows=5 width=44) (actual time=1.230..1.231 rows=5 loops=1)
Sort Key: suggestions.rank, suggestions.suggestion_id
Sort Method: top-N heapsort Memory: 25kB
-> CTE Scan on suggestions (cost=0.00..0.10 rows=5 width=44) (actual time=1.034..1.187 rows=43 loops=1)
Total runtime: 1.410 ms
Run Code Online (Sandbox Code Playgroud)
Iai*_*der 25
PostgreSQL 9.3 不为 CTE做谓词下推。
执行谓词下推的优化器可以将 where 子句移动到内部查询中。目标是尽早过滤掉不相关的数据。只要新查询在逻辑上是等效的,引擎仍会获取所有相关数据,因此会生成正确的结果,只是速度更快。
核心开发人员 Tom Lane 在pgsql-performance 邮件列表中提到了确定逻辑等效性的困难。
CTE 也被视为优化栅栏;这与其说是优化器限制,不如说是在 CTE 包含可写查询时保持语义健全。
优化器不区分只读 CTE 和可写 CTE,因此在考虑计划时过于保守。“围栏”处理阻止优化器在 CTE 内移动 where 子句,尽管我们可以看到这样做是安全的。
我们可以等待 PostgreSQL 团队改进 CTE 优化,但现在要获得良好的性能,您必须改变您的写作风格。
这个问题已经展示了一种获得更好计划的方法。复制过滤条件实质上是对谓词下推的效果进行硬编码。
在这两个计划中,引擎将结果行复制到工作表中,以便对它们进行排序。工作表越大,查询越慢。
第一个计划将基表中的所有行复制到工作表并扫描它以找到结果。为了使事情更慢,引擎必须扫描整个工作表,因为它没有索引。
这是一个荒谬的不必要的工作。它读取基表中的所有数据两次以找到答案,当基表中估计的 19350 行中只有估计的 5 个匹配行时。
第二个计划使用索引来查找匹配的行并将这些行复制到工作表中。该索引有效地为我们过滤了数据。
在The Art of SQL 的第 85 页上,Stéphane Faroult 提醒我们用户的期望。
在很大程度上,最终用户会根据他们期望的行数调整他们的耐心:当他们要求一根针时,他们很少注意大海捞针的大小。
第二个计划随针扩展,因此更有可能让您的用户满意。
新查询更难维护,因为您可以通过更改一个过滤器表达式而不是另一个来引入缺陷。
如果我们可以只编写一次并且仍然获得良好的性能,那不是很好吗?
我们可以。优化器为子查询做谓词下推。
一个更简单的例子更容易解释。
CREATE TABLE a (c INT);
CREATE TABLE b (c INT);
CREATE INDEX a_c ON a(c);
CREATE INDEX b_c ON b(c);
INSERT INTO a SELECT 1 FROM generate_series(1, 1000000);
INSERT INTO b SELECT 2 FROM a;
INSERT INTO a SELECT 3;
Run Code Online (Sandbox Code Playgroud)
这将创建两个表,每个表都有一个索引列。它们一起包含一百万1、一百万2和一个3。
您可以3使用这些查询中的任何一个来找到针。
-- CTE
EXPLAIN ANALYZE
WITH cte AS (
SELECT c FROM a
UNION ALL
SELECT c FROM b
)
SELECT c FROM cte WHERE c = 3;
-- Subquery
EXPLAIN ANALYZE
SELECT c
FROM (
SELECT c FROM a
UNION ALL
SELECT c FROM b
) AS subquery
WHERE c = 3;
Run Code Online (Sandbox Code Playgroud)
CTE 的计划进展缓慢。引擎扫描三个表并读取大约四百万行。它需要将近 1000 毫秒。
CTE Scan on cte (cost=33275.00..78275.00 rows=10000 width=4) (actual time=471.412..943.225 rows=1 loops=1)
Filter: (c = 3)
Rows Removed by Filter: 2000000
CTE cte
-> Append (cost=0.00..33275.00 rows=2000000 width=4) (actual time=0.011..409.573 rows=2000001 loops=1)
-> Seq Scan on a (cost=0.00..14425.00 rows=1000000 width=4) (actual time=0.010..114.869 rows=1000001 loops=1)
-> Seq Scan on b (cost=0.00..18850.00 rows=1000000 width=4) (actual time=5.530..104.674 rows=1000000 loops=1)
Total runtime: 948.594 ms
Run Code Online (Sandbox Code Playgroud)
子查询的计划很快。引擎只是寻找每个索引。它需要不到一毫秒。
Append (cost=0.42..8.88 rows=2 width=4) (actual time=0.021..0.038 rows=1 loops=1)
-> Index Only Scan using a_c on a (cost=0.42..4.44 rows=1 width=4) (actual time=0.020..0.021 rows=1 loops=1)
Index Cond: (c = 3)
Heap Fetches: 1
-> Index Only Scan using b_c on b (cost=0.42..4.44 rows=1 width=4) (actual time=0.016..0.016 rows=0 loops=1)
Index Cond: (c = 3)
Heap Fetches: 0
Total runtime: 0.065 ms
Run Code Online (Sandbox Code Playgroud)
有关交互式版本,请参阅SQLFiddle。
该问题询问的是 Postgres 9.3。五年后,该版本已过时,但发生了什么变化?
PostgreSQL 12现在内联了这样的 CTE。
内联WITH查询(公共表表达式)
现在,如果公共表表达式(又名
WITH查询)a) 不是递归的,b) 没有任何副作用,并且 c) 仅在查询的后续部分中引用一次,则它们现在可以自动内联到查询中。WITH这消除了自 PostgreSQL 8.4 中引入该子句以来一直存在的“优化栅栏”如果需要,您可以使用 MATERIALIZED 子句强制WITH查询具体化,例如
Run Code Online (Sandbox Code Playgroud)WITH c AS MATERIALIZED ( SELECT * FROM a WHERE a.x % 4 = 0 ) SELECT * FROM c JOIN d ON d.y = a.x;