MJ.*_*MJ. 0 postgresql pivot redshift aggregate-filter
Specifically, I have an events table that records when a user joins or leaves a team. It looks like this:
-------------------------------------
| user | event | team | timestamp |
-------------------------------------
| A | joined | 1 | 2016-1-1 |
| B | joined | 1 | 2016-1-1 |
| C | left | 1 | 2016-1-1 |
| C | joined | 2 | 2016-1-1 |
| A | left | 1 | 2016-1-2 |
| A | joined | 2 | 2016-1-2 |
| B | left | 1 | 2016-1-3 |
| A | left | 2 | 2016-1-3 |
-------------------------------------
I need to restructure it so that it looks like this:
--------------------------------------
| user | team | joined | left |
--------------------------------------
| A | 1 | 2016-1-1 | 2016-1-2 |
| A | 2 | 2016-1-2 | 2016-1-3 |
| B | 1 | 2016-1-1 | 2016-1-3 |
| C | 1 | null | 2016-1-1 |
| C | 2 | 2016-1-1 | null |
--------------------------------------
How can I do this?
For more detail, I am trying to do this in Amazon Redshift (PostgreSQL).
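Assuming all columns are NOT NULL, and that 'left' is never earlier than the associated 'joined'.
If users can only join a team once (which would ideally be enforced with a UNIQUE constraint on ("user", team)), then the solution is a simple GROUP BY, and it works in Redshift as well as in almost any RDBMS: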
    SELECT "user", team
         , min(CASE WHEN event = 'joined' THEN timestamp END) AS joined
         , max(CASE WHEN event = 'left'   THEN timestamp END) AS "left"
    FROM   event
    GROUP  BY "user", team
    ORDER  BY "user", joined NULLS FIRST;
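Note the NULLS FIRST clause. It seems you want open-ended starts (joined IS NULL) sorted first. Redshift supports this, too.
Other than that, it is the most basic form of a crosstab / pivot query.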
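To check the simple pivot against the sample data from the question, here is a minimal sketch that runs the same GROUP BY in an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for Redshift purely for convenience, which is an assumption of this example and not part of the answer. The dates are rewritten as ISO strings so that text comparison orders them chronologically, and NULLS FIRST is omitted because SQLite already sorts NULL first in ascending order.

```python
import sqlite3

# Sample rows from the question. Dates are rewritten as ISO strings
# (an assumption of this sketch) so text comparison is chronological.
EVENTS = [
    ("A", "joined", 1, "2016-01-01"),
    ("B", "joined", 1, "2016-01-01"),
    ("C", "left",   1, "2016-01-01"),
    ("C", "joined", 2, "2016-01-01"),
    ("A", "left",   1, "2016-01-02"),
    ("A", "joined", 2, "2016-01-02"),
    ("B", "left",   1, "2016-01-03"),
    ("A", "left",   2, "2016-01-03"),
]

# Same shape as the simple pivot above, minus NULLS FIRST:
# SQLite sorts NULL first in ascending order by default.
SIMPLE_PIVOT = """
SELECT "user", team
     , min(CASE WHEN event = 'joined' THEN timestamp END) AS joined
     , max(CASE WHEN event = 'left'   THEN timestamp END) AS "left"
FROM   event
GROUP  BY "user", team
ORDER  BY "user", joined
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE event ("user" TEXT, event TEXT, team INTEGER, timestamp TEXT)'
)
conn.executemany("INSERT INTO event VALUES (?, ?, ?, ?)", EVENTS)

pivot_rows = conn.execute(SIMPLE_PIVOT).fetchall()
for row in pivot_rows:
    print(row)
```

The five rows printed correspond to the desired table in the question, with None standing for null in the two open-ended rows for user C.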
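Judging from your column names and sample data, it is probably not that simple. If users can join a team multiple times (non-overlapping), you have to do more. You would not want to merge multiple team memberships into a single row, as a related answer does.
Instead, you have to pair adjacent 'joined' and 'left' rows somehow. There are many ways to do that.
For modern Postgres, I like this one best: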
    SELECT "user", team
         , min(timestamp) FILTER (WHERE event = 'joined') AS joined
         , max(timestamp) FILTER (WHERE event = 'left')   AS "left"
    FROM  (
       SELECT *, count(*) FILTER (WHERE event = 'joined')
                 OVER (PARTITION BY "user", team ORDER BY timestamp) AS ct
       FROM   event
       ) sub
    GROUP  BY "user", team, ct
    ORDER  BY "user", joined NULLS FIRST;
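This uses the aggregate FILTER clause in both the window function and the aggregate functions.
The window function counts how many times the same user has joined the same team so far, which gives every membership period its own number (ct), so adjacent rows can be grouped. It also works when a 'joined' row is missing at the start or a 'left' row is missing at the end.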
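To make the running count visible, the following sketch prints the intermediate ct column for a hypothetical user D who joins team 1 twice, plus user C's 'left' row that has no preceding 'joined'. It again uses an in-memory SQLite database through Python's sqlite3 module as a stand-in (an assumption; window functions need SQLite 3.25 or later), and the count is written with CASE instead of FILTER so that it also runs on older SQLite builds. The counting logic is the same.

```python
import sqlite3

# Hypothetical rows for illustration: D joins team 1 twice,
# C has a 'left' row without any preceding 'joined'.
EVENTS = [
    ("C", "left",   1, "2016-01-01"),
    ("D", "joined", 1, "2016-01-01"),
    ("D", "left",   1, "2016-01-02"),
    ("D", "joined", 1, "2016-01-05"),
    ("D", "left",   1, "2016-01-07"),
]

# Running count of 'joined' events per ("user", team), ordered by time.
RUNNING_COUNT = """
SELECT "user", team, event, timestamp
     , count(CASE WHEN event = 'joined' THEN 1 END)
       OVER (PARTITION BY "user", team ORDER BY timestamp) AS ct
FROM   event
ORDER  BY "user", team, timestamp
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE event ("user" TEXT, event TEXT, team INTEGER, timestamp TEXT)'
)
conn.executemany("INSERT INTO event VALUES (?, ?, ?, ?)", EVENTS)

ct_rows = conn.execute(RUNNING_COUNT).fetchall()
for row in ct_rows:
    print(row)
```

C's lone 'left' row gets ct = 0, D's first membership gets ct = 1 on both of its rows, and D's second membership gets ct = 2, so grouping by ("user", team, ct) keeps each membership period in its own row.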
Redshift, however, does not support the new FILTER clause. We can substitute a plain old CASE:
    SELECT "user", team
         , min(CASE WHEN event = 'joined' THEN timestamp END) AS joined
         , max(CASE WHEN event = 'left'   THEN timestamp END) AS "left"
    FROM  (
       SELECT *, count(CASE WHEN event = 'joined' THEN 1 END)
                 OVER (PARTITION BY "user", team ORDER BY timestamp, event
                       -- Redshift requires an explicit frame clause when an
                       -- aggregate window function has an ORDER BY
                       ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS ct
       FROM   event
       ) sub
    GROUP  BY "user", team, ct
    ORDER  BY "user", joined NULLS FIRST;
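The difference between the simple pivot and the pairing query only shows up when a user rejoins a team. As a rough illustration, the sketch below runs both on a hypothetical user D who joins and leaves team 1 twice. It uses an in-memory SQLite database through Python's sqlite3 module as a stand-in for Redshift (an assumption; window functions need SQLite 3.25 or later) and relies on SQLite's default of sorting NULL first instead of NULLS FIRST.

```python
import sqlite3

# Hypothetical user D joins and leaves team 1 twice (non-overlapping).
EVENTS = [
    ("D", "joined", 1, "2016-01-01"),
    ("D", "left",   1, "2016-01-02"),
    ("D", "joined", 1, "2016-01-05"),
    ("D", "left",   1, "2016-01-07"),
]

# Simple pivot: one row per ("user", team), so memberships get merged.
MERGED = """
SELECT "user", team
     , min(CASE WHEN event = 'joined' THEN timestamp END) AS joined
     , max(CASE WHEN event = 'left'   THEN timestamp END) AS "left"
FROM   event
GROUP  BY "user", team
ORDER  BY "user", joined
"""

# Pairing query: the running count ct separates membership periods.
PAIRED = """
SELECT "user", team
     , min(CASE WHEN event = 'joined' THEN timestamp END) AS joined
     , max(CASE WHEN event = 'left'   THEN timestamp END) AS "left"
FROM  (
   SELECT *, count(CASE WHEN event = 'joined' THEN 1 END)
             OVER (PARTITION BY "user", team ORDER BY timestamp, event) AS ct
   FROM   event
   ) sub
GROUP  BY "user", team, ct
ORDER  BY "user", joined
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE event ("user" TEXT, event TEXT, team INTEGER, timestamp TEXT)'
)
conn.executemany("INSERT INTO event VALUES (?, ?, ?, ?)", EVENTS)

merged_rows = conn.execute(MERGED).fetchall()
paired_rows = conn.execute(PAIRED).fetchall()
print("merged:", merged_rows)
print("paired:", paired_rows)
```

The simple pivot collapses both memberships into a single row running from 2016-01-01 to 2016-01-07, while the pairing query returns one row per membership period.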
Aside: you should not use reserved words as identifiers, even if Redshift (or Postgres) allows it.