Calculate a Running Total in SQL Server

cod*_*ike 158 sql t-sql sql-server running-total

Imagine the following table (called TestTable):

id     somedate    somevalue
--     --------    ---------
45     01/Jan/09   3
23     08/Jan/09   5
12     02/Feb/09   0
77     14/Feb/09   7
39     20/Feb/09   34
33     02/Mar/09   6

I would like a query that returns a running total in date order, like this:

id     somedate    somevalue  runningtotal
--     --------    ---------  ------------
45     01/Jan/09   3          3
23     08/Jan/09   5          8
12     02/Feb/09   0          8
77     14/Feb/09   7          15  
39     20/Feb/09   34         49
33     02/Mar/09   6          55

I know there are various ways of doing this in SQL Server 2000 / 2005 / 2008.

I am particularly interested in this sort of method, which uses the aggregating-set-statement trick:

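-- @AnotherTbl was not declared in the original snippet; the column types below are assumptions
DECLARE @AnotherTbl TABLE (id int, somedate datetime, somevalue int, runningtotal int)

INSERT INTO @AnotherTbl(id, somedate, somevalue, runningtotal) 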
   SELECT id, somedate, somevalue, null
   FROM TestTable
   ORDER BY somedate

DECLARE @RunningTotal int
SET @RunningTotal = 0

UPDATE @AnotherTbl
SET @RunningTotal = runningtotal = @RunningTotal + somevalue
FROM @AnotherTbl

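... this is very efficient, but I have heard there are issues with it, because you cannot necessarily guarantee that the UPDATE statement will process the rows in the correct order. Maybe we can get some definitive answers about that issue.

But maybe people can suggest other ways as well?

Edit: now with a SqlFiddle containing the setup and the "update trick" example above.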

Sam*_*ron 127

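Update: if you are running SQL Server 2012, see: https://stackoverflow.com/a/10309947

The problem is that the SQL Server implementation of the OVER clause is somewhat limited in versions before SQL Server 2012.

Oracle (and ANSI-SQL) allow you to do things like: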

 SELECT somedate, somevalue,
  SUM(somevalue) OVER(ORDER BY somedate 
     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) 
          AS RunningTotal
  FROM Table

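SQL Server gives you no clean solution to this problem. My gut tells me that this is one of those rare cases where a cursor is the fastest, though I will have to do some benchmarking on big results.

The update trick is handy, but I feel it is fairly fragile. It seems that if you are updating a full table, the update proceeds in the order of the primary key. So if you make your date an ascending primary key, you will probably be safe. But you are relying on an undocumented SQL Server implementation detail (also, I wonder what happens if the query ends up being executed by two procs; see: MAXDOP):

Full working sample: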

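-- guard added so the script also runs when #t does not exist yet
if object_id('tempdb..#t') is not null drop table #t 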
create table #t ( ord int primary key, total int, running_total int)

insert #t(ord,total)  values (2,20)
-- notice the malicious re-ordering 
insert #t(ord,total) values (1,10)
insert #t(ord,total)  values (3,10)
insert #t(ord,total)  values (4,1)

declare @total int 
set @total = 0
update #t set running_total = @total, @total = @total + total 

select * from #t
order by ord 

ord         total       running_total
----------- ----------- -------------
1           10          10
2           20          30
3           10          40
4           1           41

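You asked for a benchmark, so here is the lowdown.

The fastest SAFE way of doing this is the cursor; it is an order of magnitude faster than the correlated sub-query or the cross join.

The absolute fastest way is the UPDATE trick. My only concern with it is that I am not certain the update will proceed in a linear way under all circumstances. Nothing in the query explicitly says so.

Bottom line: for production code I would go with the cursor.

Test data: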

create table #t ( ord int primary key, total int, running_total int)

set nocount on 
declare @i int
set @i = 0 
begin tran
while @i < 10000
begin
   insert #t (ord, total) values (@i,  rand() * 100) 
    set @i = @i +1
end
commit

Test 1:

SELECT ord,total, 
    (SELECT SUM(total) 
        FROM #t b 
        WHERE b.ord <= a.ord) AS b 
FROM #t a

-- CPU 11731, Reads 154934, Duration 11135 

Test 2:

SELECT a.ord, a.total, SUM(b.total) AS RunningTotal 
FROM #t a CROSS JOIN #t b 
WHERE (b.ord <= a.ord) 
GROUP BY a.ord,a.total 
ORDER BY a.ord

-- CPU 16053, Reads 154935, Duration 4647

Test 3:

DECLARE @TotalTable table(ord int primary key, total int, running_total int)

DECLARE forward_cursor CURSOR FAST_FORWARD 
FOR 
SELECT ord, total
FROM #t 
ORDER BY ord


OPEN forward_cursor 

DECLARE @running_total int, 
    @ord int, 
    @total int
SET @running_total = 0

FETCH NEXT FROM forward_cursor INTO @ord, @total 
WHILE (@@FETCH_STATUS = 0)
BEGIN
     SET @running_total = @running_total + @total
     INSERT @TotalTable VALUES(@ord, @total, @running_total)
     FETCH NEXT FROM forward_cursor INTO @ord, @total 
END

CLOSE forward_cursor
DEALLOCATE forward_cursor

SELECT * FROM @TotalTable

-- CPU 359, Reads 30392, Duration 496

Test 4:

declare @total int 
set @total = 0
update #t set running_total = @total, @total = @total + total 

select * from #t

-- CPU 0, Reads 58, Duration 139

  • @Martin Denali is going to have a pretty nice solution for this: http://msdn.microsoft.com/en-us/library/ms189461(v=SQL.110).aspx (3 upvotes)
  • The original (Oracle (and ANSI-SQL)) answer now works in SQL Server 2017. Thanks, very elegant! (2 upvotes)

Mik*_*son 117

In SQL Server 2012 you can use SUM() with the OVER() clause.

select id,
       somedate,
       somevalue,
       sum(somevalue) over(order by somedate rows unbounded preceding) as runningtotal
from TestTable

SQL Fiddle
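
One detail about the frame clause in the query above is worth a small illustration. When ORDER BY is given without an explicit frame, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which treats rows with the same somedate as peers and sums them together. The sketch below compares the two on a hypothetical table variable (@Dup and its data are made up for this example, and id is added to the ordering as an assumed tie-breaker):

```sql
-- Hypothetical sample with a duplicated date, to compare ROWS and RANGE frames.
declare @Dup table (id int primary key, somedate date, somevalue int);

insert @Dup (id, somedate, somevalue)
values (1, '20090101', 3),
       (2, '20090108', 5),
       (3, '20090108', 2),   -- same date as id 2
       (4, '20090202', 7);

select id,
       somedate,
       somevalue,
       -- ROWS frame: accumulates one row at a time; id makes the order deterministic
       sum(somevalue) over(order by somedate, id rows unbounded preceding) as total_rows,
       -- default frame (RANGE): rows sharing the same somedate are summed together
       sum(somevalue) over(order by somedate) as total_range
from @Dup
order by somedate, id;
```

With this data, total_rows comes out as 3, 8, 10, 17 while total_range comes out as 3, 10, 10, 17, so the explicit rows frame is what gives a true row-by-row running total when dates can repeat.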


Rom*_*kar 40

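While Sam Saffron did great work on this, he still did not provide recursive common table expression code for this problem. For those of us working with SQL Server 2008 R2 rather than Denali, it is still the fastest way to get a running total: it is about 10 times faster than the cursor on my work computer, and it is also an inline query.
So, here it is (I am assuming the table has an ord column holding sequential numbers without gaps; for fast processing there should also be a unique constraint on this number):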

;with 
CTE_RunningTotal
as
(
    select T.ord, T.total, T.total as running_total
    from #t as T
    where T.ord = 0
    union all
    select T.ord, T.total, T.total + C.running_total as running_total
    from CTE_RunningTotal as C
        inner join #t as T on T.ord = C.ord + 1
)
select C.ord, C.total, C.running_total
from CTE_RunningTotal as C
option (maxrecursion 0)

-- CPU 140, Reads 110014, Duration 132

sql fiddle demo

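Update: I was also curious about this update with a variable, the so-called quirky update. Usually it works fine, but how can we be sure it works every time? Well, here is a little trick (found here: http://www.sqlservercentral.com/Forums/Topic802558-203-21.aspx#bm981258): you just check the current and the previous ord and use a 1/0 assignment in case they differ from what you expect: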

declare @total int, @ord int

select @total = 0, @ord = -1

update #t set
    @total = @total + total,
    @ord = case when ord <> @ord + 1 then 1/0 else ord end,
    ------------------------
    running_total = @total

select * from #t

-- CPU 0, Reads 58, Duration 139

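From what I have seen, if you have a proper clustered index / primary key on your table (in our case, an index on ord), the update proceeds in a linear way every time (I never encountered a divide by zero). That said, it is up to you to decide whether you want to use it in production code :)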
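
A minimal sketch of how the guarded update might be combined with two precautions that are often suggested for this pattern: an exclusive table lock and a MAXDOP 1 hint to rule out a parallel plan. This is an assumption-laden illustration, not a guarantee; the processing order is still undocumented behavior, and the 1/0 check remains the only safety net. The table #rt_demo and its rows are hypothetical.

```sql
-- Hypothetical demo table; ord starts at 0 and has no gaps.
create table #rt_demo (ord int primary key clustered, total int, running_total int);

insert #rt_demo (ord, total) values (0, 10);
insert #rt_demo (ord, total) values (1, 20);
insert #rt_demo (ord, total) values (2, 10);
insert #rt_demo (ord, total) values (3, 1);

declare @total int, @ord int;
select @total = 0, @ord = -1;

-- tablockx: nobody else touches the table while the update runs
-- maxdop 1: the update is not split across parallel workers
update #rt_demo with (tablockx)
set @total = @total + total,
    -- fail with a divide-by-zero error if rows arrive out of sequence
    @ord = case when ord <> @ord + 1 then 1/0 else ord end,
    running_total = @total
option (maxdop 1);

select ord, total, running_total from #rt_demo order by ord;

drop table #rt_demo;
```

If the rows were ever processed out of ord order, the statement would abort with an error instead of silently writing wrong totals, which is the point of keeping the check even with the hints in place.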

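  • This answer deserves more recognition (or does it have some flaw I cannot see?) (6 upvotes)
  • For cases where you already have an ordinal in the data and are looking for a concise (non-cursor) set-based solution on SQL 2008 R2, this appears to be perfect. (2 upvotes)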

小智 28

The APPLY operator in SQL 2005 and higher works for this:

select
    t.id ,
    t.somedate ,
    t.somevalue ,
    rt.runningTotal
from TestTable t
 cross apply (select sum(somevalue) as runningTotal
                from TestTable
                where somedate <= t.somedate
            ) as rt
order by t.somedate

  • Works well for smaller data sets. A downside is that you have to use identical where clauses on the inner and outer queries. (5 upvotes)
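
To illustrate the point about repeating the where clause, here is a small sketch that restricts the running total to a date range. The table variable @T is a hypothetical copy of the question's sample data (column types are assumptions), and the date range is made up for the example.

```sql
-- Hypothetical copy of the question's sample data.
declare @T table (id int, somedate datetime, somevalue int);
insert @T values (45, '20090101', 3);
insert @T values (23, '20090108', 5);
insert @T values (12, '20090202', 0);
insert @T values (77, '20090214', 7);
insert @T values (39, '20090220', 34);
insert @T values (33, '20090302', 6);

-- Hypothetical filter: February 2009 only.
declare @from datetime, @to datetime;
set @from = '20090201';
set @to   = '20090228';

select t.id, t.somedate, t.somevalue, rt.runningTotal
from @T t
cross apply (select sum(i.somevalue) as runningTotal
             from @T i
             where i.somedate <= t.somedate
               -- the same range filter has to be repeated here
               and i.somedate >= @from and i.somedate <= @to
            ) as rt
where t.somedate >= @from and t.somedate <= @to
order by t.somedate;
```

With the filter repeated, the three February rows get running totals of 0, 7 and 41. If the inner filter were left out, the January rows would still be counted and the totals would be 8, 15 and 49 instead.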

Sam*_*Axe 11

SELECT TOP 25 amount, 
    (SELECT SUM(amount) 
     FROM time_detail b 
     WHERE b.time_detail_id <= a.time_detail_id) AS Total 
FROM time_detail a

You can also use the ROW_NUMBER() function and a temp table to create an arbitrary column to use in the comparison in the inner SELECT statement.
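
A rough sketch of that idea, assuming the question's sample data: number the rows in the desired order into a temp table, then compare on that number in the correlated sub-query. The names #src, #numbered and rn are hypothetical, the column types are assumptions, and id is used as an assumed tie-breaker for equal dates.

```sql
-- Hypothetical copy of the question's sample data.
create table #src (id int, somedate datetime, somevalue int);
insert #src values (45, '20090101', 3);
insert #src values (23, '20090108', 5);
insert #src values (12, '20090202', 0);
insert #src values (77, '20090214', 7);
insert #src values (39, '20090220', 34);
insert #src values (33, '20090302', 6);

-- Number the rows in date order; id breaks ties between equal dates.
select row_number() over(order by somedate, id) as rn,
       id, somedate, somevalue
into #numbered
from #src;

-- An index on rn keeps the inner lookup from scanning the whole table each time.
create unique clustered index ix_numbered_rn on #numbered (rn);

select a.id, a.somedate, a.somevalue,
       (select sum(b.somevalue)
        from #numbered b
        where b.rn <= a.rn) as runningtotal
from #numbered a
order by a.rn;

drop table #numbered;
drop table #src;
```

Comparing on rn rather than on the date gives every row its own position, so rows with identical dates no longer share one total.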


Kth*_*rog 7

Use a correlated sub-query. Very simple, here you go:

SELECT 
somedate, 
(SELECT SUM(somevalue) FROM TestTable t2 WHERE t2.somedate<=t1.somedate) AS running_total
FROM TestTable t1
GROUP BY somedate
ORDER BY somedate

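The code might not be exactly correct, but I am sure the idea is.

The GROUP BY is there in case a date appears more than once; you would then only want to see it once in the result set.

If you do not mind seeing repeated dates, or you want to see the original value and id, then the following is what you want: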

SELECT 
id,
somedate, 
somevalue,
(SELECT SUM(somevalue) FROM TestTable t2 WHERE t2.somedate<=t1.somedate) AS running_total
FROM TestTable t1
ORDER BY somedate


A-K*_*A-K 5

You can also denormalize and store the running totals in the same table:

http://sqlblog.com/blogs/alexander_kuznetsov/archive/2009/01/23/denormalizing-to-enforce-business-rules-running-totals.aspx

Selects work much faster than with any other solution, but modifications may be slower.


小智 5

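If you are using a version later than SQL Server 2008 R2 (that is, SQL Server 2012 or above, where LAG is available), this would be the shortest way: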

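-- Note: LAG(runningtotal) reads a stored column, so this assumes TestTable
-- already has a runningtotal column; the first row yields NULL (no previous row).
Select id
    ,somedate
    ,somevalue
    ,LAG(runningtotal) OVER (ORDER BY somedate) + somevalue AS runningtotal
From TestTable 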

LAG is used to get the value from the previous row. You can search online for more information.