Ame*_*lle 5 postgresql cte view
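I'm trying to get a weighted sum of user statistics from my database. I will only ever query it for one or two users at a time, so I wrote it as a view.
Because it is a view, I "pretend" to compute the sum for every row in the table, and I hoped the optimizer would notice when I ask for a single row and optimize the query accordingly. However, my query plan is huge, and at its innermost level it computes 17 billion rows, where I think there should be at most 1000.
Here is the query: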
CREATE OR REPLACE VIEW weighted_stats AS
WITH
    clf AS (SELECT * FROM classifiers ORDER BY time_trained DESC LIMIT 1),
    weights AS (SELECT kv.key, kv.value FROM clf, each(clf.weights) AS kv),
    kvs AS (
        SELECT stats.player_id, kv.key, kv.value
        FROM stats, each(stats.hstore_column) AS kv)
SELECT
    kvs.player_id,
    SUM(kvs.value :: numeric * weights.value :: numeric) AS stats
FROM
    kvs JOIN weights USING (key)
GROUP BY kvs.player_id;
Here is the query plan:
explain analyze select * from weighted_stats where player_id=76561197960269296
GroupAggregate (cost=53645.35..299471.72 rows=1 width=72) (actual time=1014.016..1014.016 rows=0 loops=1)
Group Key: kvs.id
CTE clf
-> Limit (cost=20.65..20.65 rows=1 width=84) (actual time=0.017..0.018 rows=1 loops=1)
-> Sort (cost=20.65..22.43 rows=710 width=84) (actual time=0.014..0.014 rows=1 loops=1)
Sort Key: classifiers.time_trained
Sort Method: quicksort Memory: 25kB
-> Seq Scan on classifiers (cost=0.00..17.10 rows=710 width=84) (actual time=0.003..0.005 rows=1 loops=1)
CTE kvs
-> Seq Scan on stats (cost=0.00..53572.18 rows=10318000 width=722) (actual time=0.037..530.337 rows=336036 loops=1)
CTE weights
-> Nested Loop (cost=0.00..20.02 rows=1000 width=64) (actual time=0.036..0.046 rows=2 loops=1)
-> CTE Scan on clf (cost=0.00..0.02 rows=1 width=32) (actual time=0.020..0.023 rows=1 loops=1)
-> Function Scan on each kv (cost=0.00..10.00 rows=1000 width=64) (actual time=0.011..0.013 rows=2 loops=1)
-> Hash Join (cost=32.50..241344.73 rows=257950 width=72) (actual time=1014.012..1014.012 rows=0 loops=1)
Hash Cond: (kvs.key = weights.key)
-> CTE Scan on kvs (cost=0.00..232155.00 rows=51590 width=72) (actual time=0.044..1013.877 rows=62 loops=1)
Filter: (id = 76561197960269296::bigint)
Rows Removed by Filter: 335974
-> Hash (cost=20.00..20.00 rows=1000 width=64) (actual time=0.060..0.060 rows=2 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> CTE Scan on weights (cost=0.00..20.00 rows=1000 width=64) (actual time=0.040..0.054 rows=2 loops=1)
Planning time: 0.286 ms
Execution time: 1017.671 ms
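This is still much slower than I expected. The optimization partly works: the filter is applied before the join rather than before the grouping. But the kvs CTE, which should itself be filtered, still seems to be computed for every player.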
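PostgreSQL treats common table expressions as an "optimization fence": it never pushes predicates from the main query down into a CTE, and it never collapses joins across a CTE boundary. Instead, it typically evaluates the whole CTE as written and materializes the result; the main query then reads the temporary table produced by the CTE.
So yes, your query would probably benefit from converting the CTEs into subqueries.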
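As a rough sketch of that conversion, the view from the question could be written with derived tables instead of CTEs. The table and column names are taken from the question; the aliases `s`, `kv`, `clf`, `kvs` and `weights` are only illustrative, and this has not been run against the asker's schema. The expectation, under those assumptions, is that a filter on `player_id` can then reach the scan on `stats` instead of being applied after every row has been expanded.

```sql
CREATE OR REPLACE VIEW weighted_stats AS
SELECT
    kvs.player_id,
    SUM(kvs.value::numeric * weights.value::numeric) AS stats
FROM (
    -- One row per (player, hstore key) pair; as a plain subquery,
    -- the planner is free to push a player_id filter into it.
    SELECT s.player_id, kv.key, kv.value
    FROM stats AS s, each(s.hstore_column) AS kv
) AS kvs
JOIN (
    -- Weights of the most recently trained classifier.
    SELECT kv.key, kv.value
    FROM (
        SELECT * FROM classifiers ORDER BY time_trained DESC LIMIT 1
    ) AS clf, each(clf.weights) AS kv
) AS weights USING (key)
GROUP BY kvs.player_id;
```

Re-running `EXPLAIN ANALYZE SELECT * FROM weighted_stats WHERE player_id = ...` afterwards should show whether the filter now appears on the scan of `stats` rather than on a CTE scan.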
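Note that an actual view (one created with CREATE VIEW) does not act as an optimization fence. The view's definition is inlined into the query that uses it and then optimized as usual. For CTEs, there has been discussion of making the optimization-fence behavior optional, so that they could be used "just" to make queries more readable. However, as of version 9.5, this has not been implemented.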
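For readers on newer releases: PostgreSQL 12 and later do provide this option. There, a non-recursive, side-effect-free CTE that is referenced only once is inlined by default, and the behavior can be requested explicitly with `NOT MATERIALIZED` (or kept with `MATERIALIZED`). A minimal sketch, assuming PostgreSQL 12+ and the `stats` table from the question:

```sql
-- Assumes PostgreSQL 12 or later; this syntax is rejected by 9.5.
WITH kvs AS NOT MATERIALIZED (
    SELECT s.player_id, kv.key, kv.value
    FROM stats AS s, each(s.hstore_column) AS kv
)
SELECT *
FROM kvs
WHERE player_id = 76561197960269296;  -- filter may now be pushed into the CTE
```

On 9.5 and other releases before 12, converting the CTEs to subqueries remains the way to remove the fence.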