Ima*_* Y. 5 postgresql performance order-by postgresql-10 postgresql-performance
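Context:
PostgreSQL 10. The users table has 3,667,438 rows and a JSONB column named social. Our usual strategy is to index the output of computed functions so that the information is aggregated into a single index. The output of the engagement(social) function is of type double precision.
Problem:
The problematic clause is ORDER BY engagement(social) DESC NULLS LAST, even though a btree index, idx_in_social_engagement, defined with DESC NULLS LAST, exists on this data.
Fast query: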
EXPLAIN ANALYZE
SELECT "users".* FROM "users"
WHERE (follower_count(social) < 500000)
AND (engagement(social) > 0.03)
AND (engagement(social) < 0.25)
AND (peemv(social) < 533)
ORDER BY "users"."created_at" ASC
LIMIT 12 OFFSET 0;
Limit  (cost=0.43..52.25 rows=12 width=1333) (actual time=0.113..1.625 rows=12 loops=1)
  ->  Index Scan using created_at_idx on users  (cost=0.43..7027711.55 rows=1627352 width=1333) (actual time=0.112..1.623 rows=12 loops=1)
        Filter: ((follower_count(social) < 500000) AND (engagement(social) > '0.03'::double precision) AND (engagement(social) < '0.25'::double precision) AND (peemv(social) > '0'::double precision) AND (peemv(social) < '533'::double precision))
        Rows Removed by Filter: 8
Planning time: 0.324 ms
Execution time: 1.639 ms
Slow query:
EXPLAIN ANALYZE
SELECT "users".* FROM "users"
WHERE (follower_count(social) < 500000)
AND (engagement(social) > 0.03)
AND (engagement(social) < 0.25)
AND (peemv(social) > 0.0)
AND (peemv(social) < 533)
ORDER BY engagement(social) DESC NULLS LAST, "users"."created_at" ASC
LIMIT 12 OFFSET 0;
Limit  (cost=2884438.00..2884438.03 rows=12 width=1341) (actual time=68011.728..68011.730 rows=12 loops=1)
  ->  Sort  (cost=2884438.00..2888506.38 rows=1627352 width=1341) (actual time=68011.727..68011.728 rows=12 loops=1)
        Sort Key: (engagement(social)) DESC NULLS LAST, created_at
        Sort Method: top-N heapsort  Memory: 45kB
        ->  Index Scan using idx_in_social_engagement on users  (cost=0.43..2847131.26 rows=1627352 width=1341) (actual time=0.082..67019.102 rows=1360633 loops=1)
              Index Cond: ((engagement(social) > '0.03'::double precision) AND (engagement(social) < '0.25'::double precision))
              Filter: ((follower_count(social) < 500000) AND (peemv(social) > '0'::double precision) AND (peemv(social) < '533'::double precision))
              Rows Removed by Filter: 85580
Planning time: 0.312 ms
Execution time: 68011.752 ms
I select with * because I need all of the data stored in each row.
Update:
CREATE INDEX idx_in_social_engagement on influencers USING BTREE ( engagement(social) DESC NULLS LAST)
That is the exact index definition.
Your ORDER BY clause is:
engagement(social) DESC NULLS LAST, "users"."created_at" ASC
But I suspect your index is only on:
engagement(social) DESC NULLS LAST
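So the index cannot fully support the ORDER BY.
You can reproduce the same problem without JSONB or expression indexes. You may be able to rescue the situation by creating a composite expression index on both columns of your ORDER BY.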
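A minimal sketch of such a composite index, assuming it belongs on the users table that the queries read from (the definition quoted in the question names influencers instead). The index name is hypothetical, and the key order simply mirrors the full ORDER BY clause.

```sql
-- Hypothetical index name; adjust the table name if the data really lives elsewhere.
-- CONCURRENTLY avoids blocking writes while the index is built,
-- but it cannot be run inside a transaction block.
CREATE INDEX CONCURRENTLY idx_users_engagement_created_at
    ON users USING btree (engagement(social) DESC NULLS LAST, created_at ASC);

-- Re-run the slow query to see whether the planner now reads rows in index order.
EXPLAIN ANALYZE
SELECT "users".* FROM "users"
WHERE (follower_count(social) < 500000)
  AND (engagement(social) > 0.03)
  AND (engagement(social) < 0.25)
  AND (peemv(social) > 0.0)
  AND (peemv(social) < 533)
ORDER BY engagement(social) DESC NULLS LAST, "users"."created_at" ASC
LIMIT 12 OFFSET 0;
```

If the planner picks the new index, the plan should show an Index Scan on it directly under the Limit node, with no Sort node in between. That is worth confirming on the real data rather than assuming.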
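If the PostgreSQL planner were infinitely wise, it might be able to use the existing index efficiently. It would have to walk the index in engagement(social) DESC NULLS LAST order until it had collected 12 tuples that satisfy all of the remaining filters. Then it would keep walking that index until it had collected every remaining tuple that ties with the 12th tuple on engagement(social) (and that also satisfies the other criteria). Finally, it would have to re-sort all of the collected tuples on the full ORDER BY and apply the LIMIT 12 to that expanded and re-sorted set. But the PostgreSQL planner is not infinitely wise.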
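The strategy described above can be approximated by hand with the existing single-column index. The following is only a sketch under the same filters as the slow query; the CTE name top12 and the column alias e are made up for illustration, and whether the planner actually uses idx_in_social_engagement for both steps has to be checked with EXPLAIN ANALYZE.

```sql
-- Step 1: read rows in the index's own order and stop after 12 matches.
WITH top12 AS (
    SELECT engagement(social) AS e
    FROM users
    WHERE follower_count(social) < 500000
      AND engagement(social) > 0.03
      AND engagement(social) < 0.25
      AND peemv(social) > 0.0
      AND peemv(social) < 533
    ORDER BY engagement(social) DESC NULLS LAST
    LIMIT 12
)
-- Step 2: fetch every row that ties with or beats the 12th value,
-- then sort that small set on the full ORDER BY and keep 12.
SELECT u.*
FROM users AS u
WHERE follower_count(u.social) < 500000
  AND engagement(u.social) > 0.03
  AND engagement(u.social) < 0.25
  AND peemv(u.social) > 0.0
  AND peemv(u.social) < 533
  AND engagement(u.social) >= (SELECT min(e) FROM top12)
ORDER BY engagement(u.social) DESC NULLS LAST, u.created_at ASC
LIMIT 12;
```

Because the range filter on engagement(social) already excludes NULLs, min(e) is the 12th-highest value (or the lowest one if fewer than 12 rows qualify), so the second step only has to sort the top rows plus any ties.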