Ver*_*gen 13 sql postgresql window-functions
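Is it possible to apply multiple window functions to the same partition? (Please correct me if I'm not using the right vocabulary.)
For example, you can do: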
SELECT name, first_value() over (partition by name order by date) from table1
But is there a way to do something like:
SELECT name, (first_value() as f, last_value() as l (partition by name order by date)) from table1
where two functions are applied to the same window?
Reference: http://postgresql.ro/docs/8.4/static/tutorial-window.html
Adr*_*der 22
Can't you just specify the window for each selected expression?
Something like:
SELECT name,
first_value() OVER (partition by name order by date) as f,
last_value() OVER (partition by name order by date) as l
from table1
Also, going by your reference, you can do it like this:
SELECT sum(salary) OVER w, avg(salary) OVER w
FROM empsalary
WINDOW w AS (PARTITION BY depname ORDER BY salary DESC)
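As a minimal sketch of the named-window form applied to the question's own query: the sample rows and the `value` column below are assumptions made purely for illustration, since the question does not say which column `first_value()` and `last_value()` should read. The explicit frame is also an addition; without it, `last_value()` over an ordered window only sees rows up to the current one.

```sql
-- Hypothetical sample data; the "value" column is an assumption.
CREATE TEMP TABLE table1 (name text, date date, value integer);
INSERT INTO table1 VALUES
    ('a', '2020-01-01', 10),
    ('a', '2020-01-02', 20),
    ('b', '2020-01-01', 30),
    ('b', '2020-01-03', 40);

-- Both functions share the single window definition w.
SELECT name,
       first_value(value) OVER w AS f,
       last_value(value)  OVER w AS l
FROM table1
WINDOW w AS (
    PARTITION BY name
    ORDER BY date
    -- Extend the frame to the whole partition so last_value() sees the final row.
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
);
```

With this data, every row for `a` gets f = 10 and l = 20, and every row for `b` gets f = 30 and l = 40.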
Ski*_*rou 15
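Warning: I am not deleting this answer, since it seems technically correct and may therefore be helpful, but beware that PARTITION BY bar ORDER BY foo is probably not what you want anyway. With an ORDER BY, aggregate functions are not computed over the partition as a whole: by default the frame ends at the current row, so you get a running aggregate. That is, SELECT avg(foo) OVER (PARTITION BY bar ORDER BY foo) is not equivalent to SELECT avg(foo) OVER (PARTITION BY bar) (see the proof at the end of this answer).
Although it does not improve performance per se, if you use the same partition several times you probably want the second syntax proposed by astander, and not only because it is shorter to write. Here is why.
Consider the following query: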
SELECT
array_agg(foo)
OVER (PARTITION BY bar ORDER BY foo),
avg(baz)
OVER (PARTITION BY bar ORDER BY foo)
FROM
foobar;
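Since in principle the ordering has no effect on the computation of an average, you might be tempted to use the following query instead (no ordering on the second partition):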
SELECT
array_agg(foo)
OVER (PARTITION BY bar ORDER BY foo),
avg(baz)
OVER (PARTITION BY bar)
FROM
foobar;
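This is a big mistake, because it takes longer: the two different windows force a second WindowAgg node into the plan (about 3060 ms versus 2459 ms in the runs below). Proof: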
> EXPLAIN ANALYZE SELECT array_agg(foo) OVER (PARTITION BY bar ORDER BY foo), avg(baz) OVER (PARTITION BY bar ORDER BY foo) FROM foobar;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
WindowAgg (cost=215781.92..254591.76 rows=1724882 width=12) (actual time=969.659..2353.865 rows=1724882 loops=1)
-> Sort (cost=215781.92..220094.12 rows=1724882 width=12) (actual time=969.640..1083.039 rows=1724882 loops=1)
Sort Key: bar, foo
Sort Method: quicksort Memory: 130006kB
-> Seq Scan on foobar (cost=0.00..37100.82 rows=1724882 width=12) (actual time=0.027..393.815 rows=1724882 loops=1)
Total runtime: 2458.969 ms
(6 lignes)
> EXPLAIN ANALYZE SELECT array_agg(foo) OVER (PARTITION BY bar ORDER BY foo), avg(baz) OVER (PARTITION BY bar) FROM foobar;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
WindowAgg (cost=215781.92..276152.79 rows=1724882 width=12) (actual time=938.733..2958.811 rows=1724882 loops=1)
-> WindowAgg (cost=215781.92..250279.56 rows=1724882 width=12) (actual time=938.699..2033.172 rows=1724882 loops=1)
-> Sort (cost=215781.92..220094.12 rows=1724882 width=12) (actual time=938.683..1062.568 rows=1724882 loops=1)
Sort Key: bar, foo
Sort Method: quicksort Memory: 130006kB
-> Seq Scan on foobar (cost=0.00..37100.82 rows=1724882 width=12) (actual time=0.028..377.299 rows=1724882 loops=1)
Total runtime: 3060.041 ms
(7 lignes)
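Now, if you are aware of this issue, you will of course use the same partition everywhere. But when the same partition appears ten times or more and you keep updating the query over several days, it is quite easy to forget the ORDER BY clause on a partition that does not need it by itself.
This is where the WINDOW syntax comes in: it protects you from such careless mistakes (provided, of course, that you know it is better to minimize the number of distinct window definitions). The following is strictly equivalent to the first query (as far as I can tell from EXPLAIN ANALYZE):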
SELECT
array_agg(foo)
OVER qux,
avg(baz)
OVER qux
FROM
foobar
WINDOW
qux AS (PARTITION BY bar ORDER BY foo);
Post-warning update: I understand that the statement "SELECT avg(foo) OVER (PARTITION BY bar ORDER BY foo) is not equivalent to SELECT avg(foo) OVER (PARTITION BY bar)" may seem questionable, so here is an example:
# SELECT * FROM foobar;
foo | bar
-----+-----
1 | 1
2 | 2
3 | 1
4 | 2
(4 lines)
# SELECT array_agg(foo) OVER qux, avg(foo) OVER qux FROM foobar WINDOW qux AS (PARTITION BY bar);
array_agg | avg
-----------+-----
{1,3} | 2
{1,3} | 2
{2,4} | 3
{2,4} | 3
(4 lines)
# SELECT array_agg(foo) OVER qux, avg(foo) OVER qux FROM foobar WINDOW qux AS (PARTITION BY bar ORDER BY foo);
array_agg | avg
-----------+-----
{1} | 1
{1,3} | 2
{2} | 2
{2,4} | 3
(4 lines)
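If the goal is a whole-partition aggregate while keeping the ORDER BY, one option not mentioned in the answer is an explicit frame clause. The sketch below recreates the four-row foobar table from the example; the window names `running` and `whole` and the output column aliases are illustrative assumptions.

```sql
-- Same four rows as in the example above.
CREATE TEMP TABLE foobar (foo integer, bar integer);
INSERT INTO foobar VALUES (1, 1), (2, 2), (3, 1), (4, 2);

SELECT foo, bar,
       avg(foo) OVER running AS running_avg,    -- default frame: partition start up to the current row
       avg(foo) OVER whole   AS partition_avg   -- explicit frame: the entire partition
FROM foobar
WINDOW running AS (PARTITION BY bar ORDER BY foo),
       whole   AS (PARTITION BY bar ORDER BY foo
                   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
ORDER BY bar, foo;
```

Here running_avg follows the running pattern shown above (1, 2 for bar = 1 and 2, 3 for bar = 2), while partition_avg is 2 on both rows of bar = 1 and 3 on both rows of bar = 2. These are two distinct window definitions, so the performance caveat discussed earlier presumably applies when both appear in one query.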