If I have a query that uses a window function, such as
SELECT *, row_number() OVER (ORDER BY something) FROM table
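should the result be sorted?
I am currently testing my query in Microsoft SQL Server, and the result is definitely sorted, but I know that product tends to sort rows even when it has not been asked to.
Is this standard behavior?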
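As far as I know, the ORDER BY inside OVER only decides which row receives which number; the SQL standard leaves the order of the returned rows undefined unless the query itself ends with an ORDER BY. A minimal sketch of the safe form follows. It runs the SQL through Python's sqlite3 module rather than SQL Server, and assumes the bundled SQLite is 3.25 or newer (the first version with window functions). The table t and its sample values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (something INTEGER)")
# Insert the rows deliberately out of order.
conn.executemany("INSERT INTO t VALUES (?)", [(30,), (10,), (20,)])

# The ORDER BY inside OVER controls the numbering only.
# The outer ORDER BY is what guarantees the order of the returned rows.
rows = conn.execute(
    """
    SELECT something,
           row_number() OVER (ORDER BY something) AS rn
    FROM t
    ORDER BY rn
    """
).fetchall()
print(rows)
```

Without the final ORDER BY rn, an engine may still happen to return sorted rows because it sorted them to compute the window, but that is a side effect of the execution plan and not something to rely on.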
Here is the example:
drop table if exists tst;
create table tst (
num integer not null
);
insert into tst values (1), (2), (3);
-- window functions WITH order by clause
select *, max(num) over (partition by true order by num asc), array_agg(num) over (partition by true order by num asc) as test
from tst;
-- window functions WITHOUT order by clause
select *, max(num) over (partition by true), array_agg(num) over (partition by true) as test
from tst;
The results are as follows:
Why does the ORDER BY clause affect the aggregate functions?
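The usual explanation is the default window frame: when OVER contains an ORDER BY and no explicit frame, the frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so an aggregate becomes a running aggregate; without ORDER BY the frame is the whole partition. The sketch below shows this with the same three rows. It uses SQLite through Python's sqlite3 module (assuming SQLite 3.25 or newer) instead of PostgreSQL, so it drops partition by true and array_agg and keeps only max; the column aliases are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tst (num INTEGER NOT NULL)")
conn.executemany("INSERT INTO tst VALUES (?)", [(1,), (2,), (3,)])

rows = conn.execute(
    """
    SELECT num,
           -- ORDER BY, default frame: rows from the start up to the current row
           max(num) OVER (ORDER BY num) AS running_max,
           -- no ORDER BY: the frame is the whole partition
           max(num) OVER ()             AS overall_max,
           -- ORDER BY with an explicit frame covering the whole partition
           max(num) OVER (ORDER BY num
                          ROWS BETWEEN UNBOUNDED PRECEDING
                                   AND UNBOUNDED FOLLOWING) AS full_frame_max
    FROM tst
    ORDER BY num
    """
).fetchall()
for row in rows:
    print(row)
```

The running_max column grows as 1, 2, 3, while the other two columns are 3 on every row, which suggests that spelling out the frame is the way to keep an ORDER BY and still aggregate over the full partition.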
Suppose we have the following query.
select
  name,
  pos,
  rank() over (partition by constructor) r,
  format('%s / %s',
    row_number() over (partition by constructor),
    count(*) over (partition by constructor)
  ) "pos/global"
from (values
    ('d1-c1', 1, 'c1'),
    ('d3-c1', 3, 'c1'),
    ('d3-c2', 3, 'c2'),
    ('d2-c1', 2, 'c1'),
    ('d2-c2', 2, 'c2')
  ) t(name, pos, constructor);
The output is as follows:
 name  │ pos │ r │ pos/global
═══════╪═════╪═══╪════════════
 d1-c1 │   1 │ 1 │ 1 / 3
 d3-c1 │   3 │ 1 │ 2 / 3
 d2-c1 │   2 │ 1 │ …
The syntax for aggregates and window functions is confusing. This does not work:
SELECT (max(x)+5) OVER ()
FROM generate_series(1,10) AS t(x);
whereas this works:
SELECT 5+max(x) OVER ()
FROM generate_series(1,10) AS t(x);
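That is fine if the operator is commutative, but I was trying to divide an interval by a number of seconds (in my specific case, though I have since worked around it). Is there a way to simplify this: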
SELECT (max(x)/5) OVER ()
FROM generate_series(1,10) AS t(x);
so that no additional wrapping query is needed? Is this a PostgreSQL thing, or a SQL thing? Is there a way to disambiguate the query?
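My understanding is that OVER is part of the function-call syntax itself: it must directly follow the call to the aggregate, and the resulting window function call is then an ordinary operand. That would explain why 5+max(x) OVER () parses (OVER attaches to max(x), and 5 is added afterwards) and why putting the operator after the OVER clause should work for non-commutative operators as well. A minimal sketch, run on SQLite through Python's sqlite3 module (assuming SQLite 3.25 or newer); the table t stands in for generate_series(1,10), which a stock SQLite build may not provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
# Stand-in for generate_series(1, 10).
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1, 11)])

# OVER () follows max(x) directly; the division is applied to the
# result of the window function call, so no wrapping query is needed.
rows = conn.execute("SELECT max(x) OVER () / 5 AS scaled FROM t").fetchall()
print(rows)
```

Each of the ten rows carries the value 2 (integer division of 10 by 5). In PostgreSQL the equivalent expression would be max(x) OVER () / 5 over generate_series(1,10).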
WITH trips_by_day AS
(
SELECT DATE(trip_start_timestamp) AS trip_date,
COUNT(*) as num_trips
FROM `bigquery-public-data.chicago_taxi_trips.taxi_trips`
WHERE trip_start_timestamp >= '2016-01-01' AND trip_start_timestamp < '2018-01-01'
GROUP BY trip_date
ORDER BY trip_date
)
SELECT trip_date,
avg(num_trips)
OVER (
order by trip_date
rows between 15 preceding and 15 following
) AS avg_num_trips
FROM trips_by_day
Can anyone explain to me what "rows between 15 preceding and 15 following" means?
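As I read it, the clause defines the window frame: for each row, taken in trip_date order, the average is computed over that row, up to 15 rows before it, and up to 15 rows after it, so at most 31 rows, and fewer near either end of the data. The sketch below shrinks the frame to 1 preceding and 1 following so the effect is visible on five rows. It uses SQLite through Python's sqlite3 module (assuming SQLite 3.25 or newer) rather than BigQuery, and the sample table and values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips_by_day (trip_date TEXT, num_trips INTEGER)")
conn.executemany(
    "INSERT INTO trips_by_day VALUES (?, ?)",
    [
        ("2016-01-01", 10),
        ("2016-01-02", 20),
        ("2016-01-03", 30),
        ("2016-01-04", 40),
        ("2016-01-05", 50),
    ],
)

# Frame = previous row, current row, next row (clipped at the edges).
rows = conn.execute(
    """
    SELECT trip_date,
           avg(num_trips) OVER (
               ORDER BY trip_date
               ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
           ) AS avg_num_trips
    FROM trips_by_day
    ORDER BY trip_date
    """
).fetchall()
for row in rows:
    print(row)
```

The first row averages only itself and its successor, (10 + 20) / 2 = 15.0, the middle rows average three values, and the last row averages 40 and 50. With 15 in place of 1, the query in the question produces a centered moving average over up to 31 days.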