标签: window-functions

ORDER BY 子句中的“排名”值？

我有下表：

                                     Table "public.employee_employee"
     Column      |         Type          |                           Modifiers                            
-----------------+-----------------------+----------------------------------------------------------------
 id              | integer               | not null default nextval('employee_employee_id_seq'::regclass)
 name            | text                  | not null
 slug            | character varying(50) | not null
 title           | text                  | not null
 base            | numeric(10,2)         | 
 gross           | numeric(10,2)         | 
 overtime        | numeric(10,2)         | 
 benefits        | numeric(10,2)         | 
 total           | numeric(10,2)         | 
 other           | numeric(10,2)         | 
 year            | smallint              | not null
 jurisdiction_id | integer               | not null
 notes           | text                  | 
Indexes:
    "employee_employee_pkey" PRIMARY KEY, …

Run Code Online (Sandbox Code Playgroud)

postgresql window-functions

eda*_*vis

2014 03-15

5
推荐指数

1
解决办法

265
查看次数

计算出库单价

目前我有这样的数据，

CREATE TABLE foo (id, qty, unit_price, date, action) AS
    VALUES
    (1,  2000, 4.01235, '2015-10-10'::timestamp, 'in'),
    (2,   -30, NULL   , '2015-10-11'::timestamp, 'out'),
    (3,  1800, 4.9    , '2015-10-25'::timestamp, 'in'),
    (4, -1000, NULL   , '2015-11-12'::timestamp, 'out'),
    (5,  -980, NULL   , '2015-11-20'::timestamp, 'out');

Run Code Online (Sandbox Code Playgroud)

我需要计算传出行的平均价格，以便结果看起来像这样。

身份证 | 数量 | unit_price | 日期 | 行动
----------------------------------------------
1 | 2000 | 4.01235 | 2015-10-10 | 在
2 | -30 |    4.01235 | 2015-10-11 | 出去
3 | 1800 | 4.9 | 2015-10-25 | 在
4 | …

postgresql window-functions

Tii*_*mel

2018 05-03

5
推荐指数

1
解决办法

465
查看次数

了解为什么 rank() over 不适合不选择重复行

我想了解为什么我有不同的结果

我有一个名为 active_transfert 的表，用于记录图像传输

user_id | image_id | created_at
--------|----------|-----------
1       |1         |2014-07-10
1       |2         |2015-01-21
2       |1         |2015-05-23
3       |1         |2016-07-22
4       |6         |2017-06-01
4       |6         |2014-08-22

Run Code Online (Sandbox Code Playgroud)

我想为每个 image_id 返回唯一的 user_id。

SELECT user_id,
       image_id
FROM active_transfert
GROUP BY user_id,
         image_id; --50


SELECT user_id,
       image_id
FROM
  (SELECT user_id,
          image_id,
          rank() OVER (PARTITION BY user_id, image_id
                       ORDER BY created_at DESC) AS i_ranked
   FROM active_transfert) AS i
WHERE i.i_ranked = 1; -- 53

Run Code Online (Sandbox Code Playgroud)

我对 Redshift 运行这些查询。为什么我的第二个查询不能防止重复记录（相同的 user_id 和 image_id）？

预期结果：

user_id …

Run Code Online (Sandbox Code Playgroud)

window-functions redshift rank

Mio*_*Mio

2017 04-12

5
推荐指数

1
解决办法

1220
查看次数

计算 3 年的回撤和波动率

我有一张每日价格表，如下所示，在这个SQLfiddle 中。每条记录都与 ShareClassID、货币、日期和金额相关联。在我的真实场景中，我可以拥有至少 10 年的每日价格。

目标是计算3 年期间而非整个期间的回撤和波动性。

下面的查询正在对所有记录进行计算。对于给定的日期，我只想对过去 3 年进行计算。我不知道该怎么做。

添加AND valueDate >= DATEADD(yy, -3, GETDATE())到WHERE条款并没有给我我需要的东西：我需要从当前valueDate记录执行 3 年计算。在每一行上，我想从当前行到 -3 年执行计算。

样本数据

CREATE TABLE [dbo].ShareClassData2( 
valueDate  [date] NOT NULL,
 NAVLocal  [numeric](26, 8) NULL,
currency_fk   [bigint] NOT NULL,
vehicleShareClassGroup_fk   [bigint] NOT NULL,
 importTransactionID    [bigint] NOT NULL
)

INSERT INTO [dbo].[ShareClassData2]( [valueDate] ,[NAVLocal],[currency_fk],[vehicleShareClassGroup_fk],[importTransactionID]) VALUES (  '2014-09-29',113.49,12,22370,1  );
INSERT INTO [dbo].[ShareClassData2]( [valueDate] ,[NAVLocal],[currency_fk],[vehicleShareClassGroup_fk],[importTransactionID]) VALUES (  '2014-09-30',113.75,12,22370,1  );
INSERT INTO [dbo].[ShareClassData2]( [valueDate] ,[NAVLocal],[currency_fk],[vehicleShareClassGroup_fk],[importTransactionID]) …

Run Code Online (Sandbox Code Playgroud)

sql-server window-functions sql-server-2016

Gae*_*n56

2018 05-31

5
推荐指数

1
解决办法

803
查看次数

SQL 挑战 - 传感器阈值异常报告

我添加了一个不使用窗口函数的解决方案和一个基准测试，其中包含一个低于 Martin's Answer 的大数据集

这是GROUP BY 使用不在 SELECT 列表中的列的后续线程- 这什么时候实用、优雅或强大？

在我对这一挑战的解决方案中，我使用了一个查询，该查询按不属于选择列表的表达式进行分组。当逻辑分组元素涉及来自其他行的数据时，这经常与窗口函数一起使用。

也许这是一个矫枉过正的例子，但我认为你可能会发现挑战本身很有趣。我会等待发布我的解决方案，也许你们中的一些人可以提出更好的解决方案。

挑战

我们有一个定期记录读数值的传感器表。无法保证采样时间处于单调间隔。

您需要编写一个查询来报告“异常”，这意味着传感器报告的读数超出阈值的次数，无论是低还是高。传感器报告超过或低于阈值的每个时间段都被视为“例外”。一旦读数恢复正常，异常结束。

示例表和数据

该脚本采用 T-SQL 格式，是我的培训材料的一部分。

这是 SQLFiddle 的链接。

------------------------------------------
-- Sensor Thresholds - 1 - Setup Example --
------------------------------------------

CREATE TABLE [Sensors]
(
    [Sensor] NVARCHAR(10) NOT NULL,
    [Lower Threshold] DECIMAL(7,2) NOT NULL,
    [Upper Threshold] DECIMAL(7,2) NOT NULL,
    CONSTRAINT [PK Sensors] 
        PRIMARY KEY CLUSTERED ([Sensor]),
    CONSTRAINT [CK Value Range]
        CHECK ([Upper Threshold] > [Lower Threshold])
);
GO

INSERT INTO [Sensors]
( 
    [Sensor] , …

Run Code Online (Sandbox Code Playgroud)

t-sql window-functions group-by

SQL*_*tor

2018 12-08

5
推荐指数

2
解决办法

280
查看次数

如果至少是负数，则求和；否则就显示

仅当分区中存在负值时，我才需要将值相加。如果分区中没有负值，它应该只输出行。

这就是我现在所拥有的。初始数据作为 CTE 提供。

数据管理语言

;with ledger as (
    select accountId, type, amount
    from (
        values
         (1, 'R', -10)
        ,(1, 'V', 10)
        ,(1, 'R', 30)
        ,(2, 'R', 20)
        ,(2, 'R', -5)
        ,(2, 'V', 5)
        ,(3, 'R', 20)
        ,(3, 'R', 30)
    )   x (accountId, type, amount)
)
,b as ( --identifies accountid, type with negatives
    select
        accountid
        ,type               
    from ledger
    group by accountid, type
    having min(amount) < 0
)
,onlyPositives as (
    select
         l.accountid
        ,l.type
        ,l.amount
    from ledger l
    left join …

Run Code Online (Sandbox Code Playgroud)

sql-server aggregate window-functions group-by sql-server-2016

Kev*_*vin

2018 10-20

5
推荐指数

1
解决办法

472
查看次数

重叠范围的最大 sum()

基本上我的问题是：如何在 PostgreSQL 9.3（或 9.4）中进行涉及重叠范围的聚合操作？我手头的具体问题是，给定一个范围，我想找到适用重叠范围的最大 sum()。一个简单的例子：

create table event (
  event_id int primary key,
  event_type_id int not null,
  period tstzrange not null,
  quantity int not null
);

insert into event (event_id, event_type_id, period, quantity) values
(1, 1,'[2016-01-06 09:00:00+00,2016-01-08 17:00:00+00]',1),
(2, 1,'[2016-01-07 09:00:00+00,2016-01-07 11:00:00+00]',1),
(3, 1,'[2016-01-07 13:00:00+00,2016-01-07 17:00:00+00]',1),
(4, 2,'[2016-01-07 12:00:00+00,2016-01-07 17:00:00+00]',1);

Run Code Online (Sandbox Code Playgroud)

给定具有以下子句的查询：

select ... 
where event_type_id = 1
and period && '[2016-01-07 00:00:00+00,2016-01-07 23:59:00+00]'::tstzrange 
group by event_type_id

Run Code Online (Sandbox Code Playgroud)

期望的结果是：3，即在给定时间戳范围内sum(quantity)相同范围event_type_id重叠的最大值。

postgresql window-functions postgresql-9.3 range-types

Rob*_*Rob

2015 02-04

4
推荐指数

1
解决办法

3489
查看次数

在多列上使用 lag() （间隙和岛屿问题）

我有一个问题正在尝试解决，我需要使用前 7 列作为基础来消除表中重复的连续行，以确定它是否重复。数据如下：

textColA|textColB|textColC|textColD|textColE|numCol1|numCol2|eventNum
--------+--------+--------+--------+--------+-------+-------+--------
    x   |    y   |    z   |  foo   |  bar   |  1    | 2     | 1
    x   |    y   |    z   |  foo   |  bar   |  1    | 2     | 2 
    x   |    y   |    z   |  foo   |  bar   |  1.1  | 2     | 3
    x   |    y   |    z   |  foo   |  bar   |  1.1  | 2     | 4
    x   |    y   |    z   |  foo   |  bar   |  1    | 2     | 5 …

Run Code Online (Sandbox Code Playgroud)

postgresql window-functions

Tom*_*yer

2019 04-26

4
推荐指数

1
解决办法

1万
查看次数

为什么更改窗口函数“MAX() OVER()”的ORDER BY部分中的列会影响最终结果？

我有一个具有以下结构的表，它是数据：

create table test_table
(
Item_index   int,
Item_name    varchar(50)
)

insert into test_table (Item_index,Item_name) values (0,'A')
insert into test_table (Item_index,Item_name) values (1,'B')
insert into test_table (Item_index,Item_name) values (0,'C')
insert into test_table (Item_index,Item_name) values (1,'D')
insert into test_table (Item_index,Item_name) values (0,'E')

Run Code Online (Sandbox Code Playgroud)

我想知道为什么更改order by查询部分中的列会更改结果？在QUERY-1, I useditem_index和QUERY-2I useditem_name列中按部分顺序排列。我认为这两个查询必须生成相同的结果because I used item_index in both queries for partitioning!我现在完全困惑了！为什么按列排序会影响最终结果？

查询-1：

select t.*,
       max(t.Item_name)over(partition by t.item_index order by item_index) new_column
from test_table t;

Run Code Online (Sandbox Code Playgroud)

结果：

Item_index …

Run Code Online (Sandbox Code Playgroud)

sql-server query t-sql window-functions sql-server-2016

Pan*_*tea

lucky-day

4
推荐指数

1
解决办法

688
查看次数

聚合重叠的日期间隔

我有一个包含一些帐户的表，以及他们的订阅的开始和结束日期。但是，这些订阅有时会重叠，我需要每个连接订阅期的开始日期和结束日期。就像示例图像中一样。

我尝试将订阅期与日期参考表合并，并标记日期（如果有订阅）。然而，代码变得相当复杂。我想一定有一个更简单的解决方案。

IF OBJECT_ID('tempdb..#Subscriptions') IS NOT NULL DROP TABLE #Subscriptions
CREATE TABLE #Subscriptions (
    account_id varchar(1)
    ,start_date date
    ,end_date date
)

INSERT INTO #Subscriptions (account_id, start_date, end_date) values
('A','2019-06-20','2019-06-29'),
('A','2019-06-25','2019-07-25'),
('A','2019-07-20','2019-08-26'),
('A','2019-12-25','2020-01-25'),
('A','2021-04-27','2021-07-27'),
('A','2021-06-25','2021-07-14'),
('A','2021-07-10','2021-08-14'),
('A','2021-09-10','2021-11-12'),
('B','2019-07-13','2020-07-14'),
('B','2019-06-25','2019-08-26')

Run Code Online (Sandbox Code Playgroud)

sql-server aggregate window-functions date

900*_*_db

2022 08-26

4
推荐指数

1
解决办法

1559
查看次数