Wrapping a Postgres query in a view makes it extremely slow

Question

EMP*_*EMP 5 postgresql performance view window-functions

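I have a query that takes about 5 seconds to run on Postgres 8.4. It selects data from a view joined to some other tables, and it also uses the lag() window function, i.e.: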

SELECT *, lag(column1) OVER (PARTITION BY key1 ORDER BY ...), lag(...)
FROM view1 v
JOIN othertables USING (...)
WHERE ...

For convenience, I created a new view that simply contains

SELECT *, lag(column1) OVER (PARTITION BY key1 ORDER BY ...), lag(...)
FROM view1 v

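and then SELECT from it, applying all the other JOINs and filters as before. To my surprise, this query had not finished after 12 minutes (at which point I stopped it). Postgres is evidently choosing a different execution plan. How can I get it not to do that, i.e. to use the same plan as in the original query? I would have thought a view shouldn't change the execution plan, but apparently it does.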

Edit: What's more, I found that even if I copy the contents of the first view into the second one, it still doesn't return.

Edit 2: OK, I've simplified the query enough to post the plans.

Using the view (this doesn't return in any reasonable time):

Subquery Scan sp  (cost=5415201.23..5892463.97 rows=88382 width=370)
  Filter: (((sp.ticker)::text ~~ 'Some Ticker'::text) AND (sp.price_date >= '2010-06-01'::date))
  ->  WindowAgg  (cost=5415201.23..5680347.20 rows=53029193 width=129)
        ->  Sort  (cost=5415201.23..5441715.83 rows=53029193 width=129)
              Sort Key: sp.stock_id, sp.price_date
              ->  Hash Join  (cost=847.87..1465139.61 rows=53029193 width=129)
                    Hash Cond: (sp.stock_id = s.stock_id)
                    ->  Seq Scan on stock_prices sp  (cost=0.00..1079829.20 rows=53029401 width=115)
                    ->  Hash  (cost=744.56..744.56 rows=29519 width=18)
                          ->  Seq Scan on stocks s  (cost=0.00..744.56 rows=29519 width=18)

Moving the window function out of the view and into the query itself (this returns instantly):

WindowAgg  (cost=34.91..34.95 rows=7 width=129)
  ->  Sort  (cost=34.91..34.92 rows=7 width=129)
        Sort Key: sp.stock_id, sp.price_date
        ->  Nested Loop  (cost=0.00..34.89 rows=7 width=129)
              ->  Index Scan using stocks_ticker_unique on stocks s  (cost=0.00..4.06 rows=1 width=18)
                    Index Cond: ((ticker)::text = 'Some Ticker'::text)
                    Filter: ((ticker)::text ~~ 'Some Ticker'::text)
              ->  Index Scan using stock_prices_id_date_idx on stock_prices sp  (cost=0.00..30.79 rows=14 width=115)
                    Index Cond: ((sp.stock_id = s.stock_id) AND (sp.price_date >= '2010-06-01'::date))

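So it seems that in the slow case it first applies the window function to all the data and only then filters it, which is probably the problem. I don't know why it does that, though.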
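One way to see why the two orderings are not interchangeable: filtering on price_date before or after lag() can produce different rows, so the outer filter cannot simply be moved inside the view without changing the result. The sketch below is only an illustration under stated assumptions: it uses Python's sqlite3 module (SQLite 3.25 or later) as an in-process stand-in rather than Postgres 8.4, and the three-row stock_prices table with a price column is invented for the demonstration.

```python
import sqlite3

# Toy stand-in for stock_prices; the rows and the price column are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stock_prices (stock_id INTEGER, price_date TEXT, price REAL);
    INSERT INTO stock_prices VALUES
        (1, '2010-05-31', 10.0),
        (1, '2010-06-01', 11.0),
        (1, '2010-06-02', 12.0);
""")

# View-style: lag() is computed over all rows, then the outer filter applies.
lag_then_filter = conn.execute("""
    SELECT price_date, prev_price FROM (
        SELECT price_date,
               lag(price) OVER (PARTITION BY stock_id ORDER BY price_date) AS prev_price
        FROM stock_prices
    ) AS v
    WHERE price_date >= '2010-06-01'
    ORDER BY price_date
""").fetchall()

# Inline-style: WHERE is evaluated before the window function at the same query level.
filter_then_lag = conn.execute("""
    SELECT price_date,
           lag(price) OVER (PARTITION BY stock_id ORDER BY price_date) AS prev_price
    FROM stock_prices
    WHERE price_date >= '2010-06-01'
    ORDER BY price_date
""").fetchall()

print(lag_then_filter)
print(filter_then_lag)
```

In the view-style query the row for 2010-06-01 still sees the previous day's price, while in the inline query its lag() value is NULL because the earlier row was filtered out first. So the fast inline query and the slow view query are not strictly the same query, which is consistent with the planner evaluating the window function over the full input in the view case.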

Answer 1

Den*_*rdy 2

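The difference between the two plans comes from joining against an aggregate, which prevents the use of a nested loop plan. When you use aggregates in your view, you put yourself at a disadvantage.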

For example, this will almost always result in a merge or hash join plan on the two tables, followed by a top-n sort:

select foo.*
from foo
join (select bar.* from bar group by bar.field) as bar on foo.field = bar.field
where ...
order by bar.field
limit 10;
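If the filter is on the grouping column itself, it can be written inside the aggregate subquery by hand without changing the result. The sketch below checks only that equivalence on toy data. It makes no claim about which plan Postgres would pick, and it runs on SQLite through Python's sqlite3 module as a stand-in. The foo and bar names follow the example above, while the label and amount columns and all rows are assumptions made for illustration.

```python
import sqlite3

# Toy tables; label, amount and the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (field INTEGER, label TEXT);
    CREATE TABLE bar (field INTEGER, amount INTEGER);
    INSERT INTO foo VALUES (1, 'a'), (2, 'b');
    INSERT INTO bar VALUES (1, 10), (1, 20), (2, 5);
""")

# Filter applied only outside the aggregate subquery.
outer_filter = conn.execute("""
    SELECT foo.label, b.total
    FROM foo
    JOIN (SELECT field, sum(amount) AS total FROM bar GROUP BY field) AS b
      ON foo.field = b.field
    WHERE foo.field = 1
""").fetchall()

# Same filter repeated inside the subquery, on the grouping column.
inner_filter = conn.execute("""
    SELECT foo.label, b.total
    FROM foo
    JOIN (SELECT field, sum(amount) AS total FROM bar
          WHERE field = 1 GROUP BY field) AS b
      ON foo.field = b.field
    WHERE foo.field = 1
""").fetchall()

print(outer_filter, inner_filter)
```

Both queries return the same single row, because a predicate on the grouping column removes whole groups rather than rows within a group. The same reasoning does not carry over to predicates on other columns.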

Archived: 15 years, 6 months ago
Views: 6063
Last recorded: 12 years, 7 months ago