jpm*_*c26 5 oracle oracle-11g-r2
Consider the following table:
ID | GROUP_ID | ORDER_VAL | RESET_VAL | VAL
---+----------+-----------+-----------+-----
1 | 1 | 1 | (null) | 3
2 | 1 | 2 | (null) | 2
3 | 1 | 3 | (null) | 1
4 | 1 | 4 | 4 | 2
5 | 1 | 5 | (null) | 1
6 | 2 | 1 | (null) | 4
7 | 2 | 2 | 2 | 3
8 | 2 | 3 | (null) | 4
9 | 2 | 4 | (null) | 2
10 | 2 | 5 | (null) | 2
11 | 2 | 6 | (null) | 4
12 | 2 | 7 | 14 | 2
13 | 2 | 8 | (null) | 2
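For each row, I need to compute the cumulative sum of VAL over all preceding rows (ordered by ORDER_VAL and grouped by GROUP_ID), but every time a non-NULL RESET_VAL is encountered, I need to use that value as the sum. Subsequent rows also need to build on the RESET_VAL rather than on the actual sum. Note that each group can have multiple reset values.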
This is the result I expect for the table above:
ID | GROUP_ID | ORDER_VAL | RESET_VAL | VAL | CUMSUM
---+----------+-----------+-----------+-----+-------
1 | 1 | 1 | (null) | 3 | 0
2 | 1 | 2 | (null) | 2 | 3
3 | 1 | 3 | (null) | 1 | 5
4 | 1 | 4 | 4 | 2 | 4
5 | 1 | 5 | (null) | 1 | 6
6 | 2 | 1 | (null) | 4 | 0
7 | 2 | 2 | 2 | 3 | 2
8 | 2 | 3 | (null) | 4 | 5
9 | 2 | 4 | (null) | 2 | 9
10 | 2 | 5 | (null) | 2 | 11
11 | 2 | 6 | (null) | 4 | 13
12 | 2 | 7 | 14 | 2 | 14
13 | 2 | 8 | (null) | 2 | 16
Without the reset values, I could use a window query:
SELECT temp.*,
COALESCE(SUM(val) OVER (PARTITION BY group_id ORDER BY order_val ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING),
0) AS cumsum
FROM temp;
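I initially assumed, wrongly, that I could just put RESET_VAL at the start of the COALESCE, but that doesn't work because it does not reset the value for subsequent rows.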
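I also tried this solution, but it only resets to zero, not to the value in the column. Adapting it to do so proved non-trivial, because the value has to propagate to all subsequent rows.
A recursive query seems like a natural fit, but I haven't figured out how to write it.
I should probably mention that the table I actually have to deal with is much larger than the example above (hundreds of thousands to a few million rows), so please mention any performance pitfalls in your answers.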
The following works, though there may be a smarter version. Explanation of the query logic:
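We first find how many "resets" have happened up to and including the current row, by counting the non-null values of the reset_val column, so that we can split the rows into subgroups.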
We also use another window function, LAST_VALUE() with IGNORE NULLS, so that we can find the last reset_value.
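Note that both window functions, COUNT() and LAST_VALUE(), have an ORDER BY, so the default window applies: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which is equivalent to ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW as long as order_val is unique within each group. The window clause is omitted from the query to keep the code cleaner.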
Assuming val is not nullable, the other window function (the SUM() in the second CTE) can also be shortened, from:
COALESCE(SUM(val) OVER
(PARTITION BY group_id, reset_count
ORDER BY order_val
ROWS BETWEEN UNBOUNDED PRECEDING
AND 1 PRECEDING), 0)
to the following (which also avoids the COALESCE()):
SUM(val) OVER
(PARTITION BY group_id, reset_count
ORDER BY order_val)
- val
Finally, in the second CTE, we use the subgroups found above (via PARTITION BY group_id, reset_count) to compute the cumulative sums.
WITH x AS
( SELECT temp.*,
COUNT(reset_val) OVER
(PARTITION BY group_id
ORDER BY order_val)
AS reset_count,
COALESCE(LAST_VALUE(reset_val IGNORE NULLS) OVER
(PARTITION BY group_id
ORDER BY order_val), 0)
AS reset_value
FROM temp
) ,
y AS
( SELECT x.*,
COALESCE(SUM(val) OVER
(PARTITION BY group_id, reset_count
ORDER BY order_val
ROWS BETWEEN UNBOUNDED PRECEDING
AND 1 PRECEDING), 0)
+ reset_value AS cumsum
FROM x
)
SELECT *
FROM y ;
Tested at SQLfiddle.
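The subgroup decomposition behind the query above can also be sketched outside the database, which may help when checking results by hand. This is only an illustration in Python, not part of the SQL solution: the names ROWS, cte_x and cte_y are hypothetical, the data is the sample table from the question, and order_val is assumed to be unique within each group.

```python
from itertools import groupby

# Sample rows from the question: (id, group_id, order_val, reset_val, val)
ROWS = [
    (1, 1, 1, None, 3), (2, 1, 2, None, 2), (3, 1, 3, None, 1),
    (4, 1, 4, 4, 2), (5, 1, 5, None, 1),
    (6, 2, 1, None, 4), (7, 2, 2, 2, 3), (8, 2, 3, None, 4),
    (9, 2, 4, None, 2), (10, 2, 5, None, 2), (11, 2, 6, None, 4),
    (12, 2, 7, 14, 2), (13, 2, 8, None, 2),
]


def cte_x(rows):
    """Mimic the first CTE: attach reset_count and reset_value to each row."""
    out = []
    ordered = sorted(rows, key=lambda r: (r[1], r[2]))
    for _, group in groupby(ordered, key=lambda r: r[1]):
        reset_count = 0  # running count of non-null reset_val
        reset_value = 0  # last non-null reset_val seen so far, else 0
        for row_id, group_id, order_val, reset_val, val in group:
            if reset_val is not None:
                reset_count += 1
                reset_value = reset_val
            out.append((row_id, group_id, order_val, val,
                        reset_count, reset_value))
    return out


def cte_y(x_rows):
    """Mimic the second CTE: sum preceding vals per (group_id, reset_count)."""
    result = {}
    # x_rows is sorted by (group_id, order_val) and reset_count never
    # decreases within a group, so each subgroup is a contiguous run.
    for _, subgroup in groupby(x_rows, key=lambda r: (r[1], r[4])):
        preceding = 0  # sum of val over earlier rows of this subgroup
        for row_id, _, _, val, _, reset_value in subgroup:
            result[row_id] = preceding + reset_value
            preceding += val
    return result


cumsum_by_id = cte_y(cte_x(ROWS))
```

Here cumsum_by_id maps each ID to its CUMSUM, so it can be compared row by row with the expected result in the question.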
Another variant, based on @Chris's recursive answer. (Slightly improved: it works with non-consecutive order_val and avoids the final GROUP BY.)
It also handles the case where the first row of a group has a reset_val:
WITH row_nums AS
( SELECT id, group_id, order_val, reset_val, val,
ROW_NUMBER() OVER (PARTITION BY group_id
ORDER BY order_val)
AS rn
FROM temp
) ,
updated_temp (id, group_id, order_val, reset_val, val, rn, cumsum) AS
( SELECT id, group_id, order_val, reset_val, val, rn,
COALESCE(reset_val, 0)
FROM row_nums
WHERE rn = 1
UNION ALL
SELECT curr.id, curr.group_id, curr.order_val, curr.reset_val, curr.val, curr.rn,
COALESCE(curr.reset_val, prev.val + prev.cumsum)
FROM row_nums curr
JOIN updated_temp prev
ON curr.rn-1 = prev.rn
AND curr.group_id = prev.group_id
)
SELECT id, group_id, order_val, reset_val, val, cumsum
FROM updated_temp
ORDER BY group_id, order_val ;
Tested at SQLfiddle-2.
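The recursive step above boils down to a one-row recurrence: the first row of a group starts at COALESCE(reset_val, 0), and every later row takes COALESCE(reset_val, previous val + previous cumsum). As a rough sketch of that recurrence only (in Python, with the hypothetical helper name cumsum_recurrence, and assuming the rows of a single group are already sorted by order_val):

```python
def cumsum_recurrence(rows):
    """rows: (reset_val, val) pairs for ONE group, sorted by order_val."""
    out = []
    prev_val = None
    prev_cumsum = None
    for reset_val, val in rows:
        if reset_val is not None:
            cumsum = reset_val               # a reset overrides the sum
        elif prev_cumsum is None:
            cumsum = 0                       # first row of the group, no reset
        else:
            cumsum = prev_val + prev_cumsum  # build on the previous row
        out.append(cumsum)
        prev_val, prev_cumsum = val, cumsum
    return out


# GROUP_ID = 2 from the sample table in the question
group_2 = [(None, 4), (2, 3), (None, 4), (None, 2),
           (None, 2), (None, 4), (14, 2), (None, 2)]
print(cumsum_recurrence(group_2))  # [0, 2, 5, 9, 11, 13, 14, 16]
```

Each row depends only on the row immediately before it, which is why the recursive query joins on curr.rn-1 = prev.rn within the same group_id.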
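Another variant uses the older (proprietary) CONNECT BY syntax for recursive queries. It is more compact, but I find it harder to write and read than the CTE version: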
WITH row_nums AS
( SELECT id, group_id, order_val, reset_val, val,
ROW_NUMBER() OVER (PARTITION BY group_id
ORDER BY order_val)
AS rn,
COALESCE(reset_val, 0) AS cumsum
FROM temp
)
SELECT id, group_id, order_val, reset_val, val, rn,
COALESCE(reset_val, PRIOR val + PRIOR cumsum, 0) AS cumsum
FROM row_nums
START WITH rn = 1 OR reset_val IS NOT NULL
CONNECT BY rn-1 = PRIOR rn
AND group_id = PRIOR group_id
AND reset_val IS NULL
ORDER BY group_id, order_val ;
Tested at SQLfiddle-3.