How to get a cumulative sum

ps.*_*ps. 165 sql-server sql-server-2008

declare  @t table
    (
        id int,
        SomeNumt int
    )

insert into @t
select 1,10
union
select 2,12
union
select 3,3
union
select 4,15
union
select 5,23


select * from @t

The select above returns the following.

id  SomeNumt
1   10
2   12
3   3
4   15
5   23

How do I get the following?

id  srome   CumSrome
1   10  10
2   12  22
3   3   25
4   15  40
5   23  63

Red*_*ter 203

select t1.id, t1.SomeNumt, SUM(t2.SomeNumt) as sum
from @t t1
inner join @t t2 on t1.id >= t2.id
group by t1.id, t1.SomeNumt
order by t1.id

SQL Fiddle example

Output

| ID | SOMENUMT | SUM |
-----------------------
|  1 |       10 |  10 |
|  2 |       12 |  22 |
|  3 |        3 |  25 |
|  4 |       15 |  40 |
|  5 |       23 |  63 |

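Edit: this is a generalized solution that works across most database platforms. If a better solution is available for your specific platform (e.g., gareth's), use it!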

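  • @Franklin It is only cost-efficient for small tables. The cost grows in proportion to the square of the number of rows. SQL Server 2012 allows this to be done much more efficiently. (11 upvotes)
  • FWIW, I've had my knuckles rapped by a DBA for doing this. I think the reason is that it gets really expensive, really quickly. That said, it makes a great interview question, since most data analysts/scientists will have had to solve this problem once or twice :) (3 upvotes)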
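
The quadratic growth mentioned in these comments can be made visible by counting the rows that the self-join produces before grouping. The following is a minimal sketch that rebuilds the question's sample data; the column alias `JoinedRows` is only illustrative.

```sql
-- Rebuild the question's sample data (5 rows).
DECLARE @t TABLE (id int, SomeNumt int);
INSERT INTO @t (id, SomeNumt)
VALUES (1, 10), (2, 12), (3, 3), (4, 15), (5, 23);

-- Each row t1 pairs with every row t2 at or before it,
-- so n rows produce n * (n + 1) / 2 intermediate rows.
SELECT COUNT(*) AS JoinedRows
FROM @t t1
INNER JOIN @t t2 ON t1.id >= t2.id;
-- 5 rows -> 15 joined rows
```

By the same formula, 100,000 source rows would mean roughly 5 billion joined rows, which is why this approach only suits small tables.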

Anonymous 175

SQL Server 2012 and later versions permit the following.

SELECT 
    RowID, 
    Col1,
    SUM(Col1) OVER(ORDER BY RowId ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Col2
FROM tablehh
ORDER BY RowId

or

SELECT 
    GroupID, 
    RowID, 
    Col1,
    SUM(Col1) OVER(PARTITION BY GroupID ORDER BY RowId ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Col2
FROM tablehh
ORDER BY RowId

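This is even faster. For me, the partitioned version completes in 34 seconds over 5 million rows.

Thanks to Peso, who commented on the SQL Team thread referred to in another answer.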

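  • For brevity, you can use `ROWS UNBOUNDED PRECEDING` instead of `ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`. (21 upvotes)
  • Note: if the column you want to sum cumulatively is itself already a sum or count, you can either wrap the whole thing as an inner query or actually write `SUM(COUNT(*)) OVER (ORDER BY RowId ROWS UNBOUNDED PRECEDING) AS CumulativeSum`. It wasn't obvious to me that it would work, but it did :-) (2 upvotes)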
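
The two comments above can be combined into one query. This is a sketch that assumes the `tablehh` table and `GroupID` column from the answer; the aliases `RowsInGroup` and `RunningRows` are made up for illustration.

```sql
-- COUNT(*) is evaluated per group first; the window SUM then
-- accumulates those per-group counts in GroupID order.
SELECT
    GroupID,
    COUNT(*) AS RowsInGroup,
    SUM(COUNT(*)) OVER (ORDER BY GroupID ROWS UNBOUNDED PRECEDING) AS RunningRows
FROM tablehh
GROUP BY GroupID
ORDER BY GroupID;
```

This works because window functions are evaluated after `GROUP BY`, so the inner aggregate is already available when the outer `SUM ... OVER` runs.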

Anonymous 22

For SQL Server 2012 onwards it can be as simple as:

SELECT id, SomeNumt, sum(SomeNumt) OVER (ORDER BY id) as CumSrome FROM @t

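This works because, when ORDER BY is specified for SUM without an explicit frame, the window frame defaults to RANGE UNBOUNDED PRECEDING AND CURRENT ROW (see "General Remarks" at https://msdn.microsoft.com/en-us/library/ms189461.aspx).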
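
The default RANGE frame matters when the ORDER BY column contains duplicates, because RANGE treats rows with equal sort values as peers and gives them the same total. The sketch below uses a hypothetical table variable `@d` with a repeated `id` to contrast the default frame with an explicit ROWS frame.

```sql
-- Hypothetical data: id 1 appears twice.
DECLARE @d TABLE (id int, amount int);
INSERT INTO @d (id, amount)
VALUES (1, 10), (1, 5), (2, 7);

SELECT
    id,
    amount,
    -- Default frame (RANGE): both id = 1 rows get 15, the id = 2 row gets 22.
    SUM(amount) OVER (ORDER BY id) AS RangeSum,
    -- ROWS frame: a true row-by-row running total; the order in which the
    -- two id = 1 rows are accumulated is not guaranteed.
    SUM(amount) OVER (ORDER BY id ROWS UNBOUNDED PRECEDING) AS RowsSum
FROM @d;
```

With a unique ordering column such as the question's `id`, both frames return the same running total.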


Dam*_*vic 12

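A CTE version, just for fun. Note that it starts at id = 1 and joins on id - 1, so it assumes the ids are consecutive integers beginning at 1: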

;
WITH  abcd
        AS ( SELECT id
                   ,SomeNumt
                   ,SomeNumt AS MySum
             FROM   @t
             WHERE  id = 1
             UNION ALL
             SELECT t.id
                   ,t.SomeNumt
                   ,t.SomeNumt + a.MySum AS MySum
             FROM   @t AS t
                    JOIN abcd AS a ON a.id = t.id - 1
           )
  SELECT  *  FROM    abcd
OPTION  ( MAXRECURSION 1000 ) -- limit recursion here, or 0 for no limit.

Returns:

id          SomeNumt    MySum
----------- ----------- -----------
1           10          10
2           12          22
3           3           25
4           15          40
5           23          63


Nee*_*rma 11

Let's first create a table with dummy data:

Create Table CUMULATIVESUM (id tinyint , SomeValue tinyint)

-- Now let's put some data in the table

Insert Into CUMULATIVESUM

Select 1, 10 union 
Select 2, 2  union
Select 3, 6  union
Select 4, 10 

Here I join the table to itself (a self join):

Select c1.ID, c1.SomeValue, c2.SomeValue
From CumulativeSum c1,  CumulativeSum c2
Where c1.id >= c2.ID
Order By c1.id Asc

Result:

ID  SomeValue   SomeValue
1   10          10
2   2           10
2   2            2
3   6           10
3   6            2
3   6            6
4   10          10
4   10           2
4   10           6
4   10          10

Now we just sum the SomeValue of c2 and we get the answer:

Select c1.ID, c1.SomeValue, Sum(c2.SomeValue) CumulativeSumValue
From CumulativeSum c1,  CumulativeSum c2
Where c1.id >= c2.ID
Group By c1.ID, c1.SomeValue
Order By c1.id Asc

For SQL Server 2012 and above (much better performance):

Select c1.ID, c1.SomeValue, 
SUM (c1.SomeValue) OVER (ORDER BY c1.ID) AS CumulativeSumValue
From CumulativeSum c1
Order By c1.id Asc

Desired result

ID  SomeValue   CumulativeSumValue
1   10          10
2   2           12
3   6           18
4   10          28

-- Clean up the dummy table
Drop Table CumulativeSum


Anonymous 5

Select 
    *, 
    (Select Sum(SOMENUMT) 
     From @t S 
     Where S.id <= M.id)
From @t M

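  • @RaRdEvA It isn't great for performance, though: it runs that correlated subquery for every single row of the result set, scanning more and more rows as it goes. It doesn't keep a running total and scan the data once the way window functions can. (2 upvotes)
  • @Davos You are right; it becomes very slow once you go beyond 100,000 records. (2 upvotes)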

Adi*_*tya 5

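A late answer, but it shows one more possibility...

Cumulative sum generation can be further optimized with the CROSS APPLY logic.

When analyzing the actual query plan, it performed better than the INNER JOIN and OVER clause versions.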

/* Create table & populate data */
IF OBJECT_ID('tempdb..#TMP') IS NOT NULL
DROP TABLE #TMP 

SELECT * INTO #TMP 
FROM (
SELECT 1 AS id
UNION 
SELECT 2 AS id
UNION 
SELECT 3 AS id
UNION 
SELECT 4 AS id
UNION 
SELECT 5 AS id
) Tab


/* Using CROSS APPLY 
Query cost relative to the batch 17%
*/    
SELECT   T1.id, 
         T2.CumSum 
FROM     #TMP T1 
         CROSS APPLY ( 
         SELECT   SUM(T2.id) AS CumSum 
         FROM     #TMP T2 
         WHERE    T1.id >= T2.id
         ) T2

/* Using INNER JOIN 
Query cost relative to the batch 46%
*/
SELECT   T1.id, 
         SUM(T2.id) CumSum
FROM     #TMP T1
         INNER JOIN #TMP T2
                 ON T1.id > = T2.id
GROUP BY T1.id

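/* Using OVER clause
Query cost relative to the batch 37%
(reported for PARTITION BY id, which returns each id unchanged
 rather than a running total; ORDER BY makes the sum cumulative)
*/
SELECT   T1.id, 
         SUM(T1.id) OVER (ORDER BY T1.id ROWS UNBOUNDED PRECEDING) AS CumSum
FROM     #TMP T1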

Output:-
  id       CumSum
-------   ------- 
   1         1
   2         3
   3         6
   4         10
   5         15

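  • I'm not persuaded. "Query cost relative to the batch" is meaningless for comparing the performance of queries. Query costs are estimates the query planner uses to quickly weigh up different plans and choose the least costly one, but those costs are for comparing plans for the _same query_; they are not relevant or comparable _between queries_ at all. This sample dataset is also too small to show any significant difference between the three methods. Try again with 1m rows, look at the actual execution plans, run it with `SET STATISTICS IO ON`, and compare the CPU and elapsed times. (2 upvotes)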
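
The measurement suggested in this comment could be set up roughly as follows. This is only a sketch: the temp table name `#Big` is made up, and generating rows by cross joining `sys.all_objects` with itself is an assumption about a convenient row source, not part of the original answer.

```sql
-- Build a larger test table (about 1m rows) instead of the 5-row #TMP.
IF OBJECT_ID('tempdb..#Big') IS NOT NULL
    DROP TABLE #Big;

SELECT TOP (1000000)
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS id
INTO   #Big
FROM   sys.all_objects a
       CROSS JOIN sys.all_objects b;

-- Report logical reads plus CPU and elapsed time per statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Candidate under test: the window-function version.
-- The sum is cast to bigint because it exceeds the int range at this size.
SELECT id,
       SUM(CAST(id AS bigint)) OVER (ORDER BY id ROWS UNBOUNDED PRECEDING) AS CumSum
FROM   #Big;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Swap in the CROSS APPLY or INNER JOIN query to compare, but start those with a much smaller `TOP` value, since both scan a growing range of rows for every output row.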