Skipping nulls when counting with RANK() OVER

Ian*_*oyd 5 sql t-sql sql-server

Given a set of rows where a field is sometimes null and sometimes not:

SELECT 
   Date, TheThing
FROM MyData
ORDER BY Date


Date                     TheThing
-----------------------  --------
2016-03-09 08:17:29.867  a
2016-03-09 08:18:33.327  a
2016-03-09 14:32:01.240  NULL
2016-10-21 19:53:49.983  NULL
2016-11-12 03:25:21.753  b
2016-11-24 07:43:24.483  NULL
2016-11-28 16:06:23.090  b
2016-11-28 16:09:07.200  c
2016-12-10 11:21:55.807  c

I want a ranking column that counts only the non-null values:

Date                     TheThing  DesiredTotal
-----------------------  --------  ------------
2016-03-09 08:17:29.867  a         1
2016-03-09 08:18:33.327  a         2
2016-03-09 14:32:01.240  NULL      2 <---notice it's still 2 (good)
2016-10-21 19:53:49.983  NULL      2 <---notice it's still 2 (good)
2016-11-12 03:25:21.753  b         3
2016-11-24 07:43:24.483  NULL      3 <---notice it's still 3 (good)
2016-11-28 16:06:23.090  b         4
2016-11-28 16:09:07.200  c         5
2016-12-10 11:21:55.807  c         6

I tried the obvious approach:

SELECT 
   Date, TheThing, 
   RANK() OVER(ORDER BY Date) AS Total
FROM MyData
ORDER BY Date

RANK() counts the nulls:

Date                     TheThing  Total
-----------------------  --------  -----
2016-03-09 08:17:29.867  a         1
2016-03-09 08:18:33.327  a         2
2016-03-09 14:32:01.240  NULL      3 <--- notice it is 3 (bad)
2016-10-21 19:53:49.983  NULL      4 <--- notice it is 4 (bad)
2016-11-12 03:25:21.753  b         5 <--- and all the rest are wrong (bad)
2016-11-24 07:43:24.483  NULL      6
2016-11-28 16:06:23.090  b         7
2016-11-28 16:09:07.200  c         8
2016-12-10 11:21:55.807  c         9

How can I instruct RANK() (or DENSE_RANK()) not to count nulls?

Have you tried using PARTITION BY?

Why yes! It is even worse:

SELECT 
   Date, TheThing, 
   RANK() OVER(PARTITION BY(CASE WHEN TheThing IS NOT NULL THEN 1 ELSE 0 END) ORDER BY Date) AS Total
FROM MyData
ORDER BY Date

RANK() still counts the nulls:

Date                     TheThing  Total
-----------------------  --------  -----
2016-03-09 08:17:29.867  a         1
2016-03-09 08:18:33.327  a         2
2016-03-09 14:32:01.240  NULL      1 <--- reset to 1?
2016-10-21 19:53:49.983  NULL      2 <--- why go up?
2016-11-12 03:25:21.753  b         3 
2016-11-24 07:43:24.483  NULL      3 <--- didn't reset?
2016-11-28 16:06:23.090  b         4 
2016-11-28 16:09:07.200  c         5
2016-12-10 11:21:55.807  c         6
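The numbers above become less mysterious once the partition key is shown next to each row: RANK() restarts inside each partition, so the NULL rows are ranked 1, 2, 3 among themselves while the non-NULL rows are ranked 1 through 6. A minimal sketch, assuming the sample rows are inlined through a CTE so the query runs on its own (the `PartitionKey` column and the inline `VALUES` list are illustrative additions, not part of the original table):

```sql
-- Inline copy of the question's sample rows (illustrative; replaces the real MyData table).
WITH MyData (Date, TheThing) AS (
    SELECT CAST(d AS datetime), t
    FROM (VALUES
        ('2016-03-09T08:17:29.867', 'a'),
        ('2016-03-09T08:18:33.327', 'a'),
        ('2016-03-09T14:32:01.240', NULL),
        ('2016-10-21T19:53:49.983', NULL),
        ('2016-11-12T03:25:21.753', 'b'),
        ('2016-11-24T07:43:24.483', NULL),
        ('2016-11-28T16:06:23.090', 'b'),
        ('2016-11-28T16:09:07.200', 'c'),
        ('2016-12-10T11:21:55.807', 'c')
    ) AS v (d, t)
)
SELECT Date, TheThing,
       -- The same expression used in PARTITION BY, exposed as a column.
       CASE WHEN TheThing IS NOT NULL THEN 1 ELSE 0 END AS PartitionKey,
       RANK() OVER (PARTITION BY CASE WHEN TheThing IS NOT NULL THEN 1 ELSE 0 END
                    ORDER BY Date) AS Total
FROM MyData
-- Group the output by partition so each independent 1..n sequence is visible.
ORDER BY PartitionKey DESC, Date;
```

Sorted this way, the six non-NULL rows read 1 through 6 and the three NULL rows read 1 through 3. Interleaving the two partitions by Date is what produces the sequence 1, 2, 1, 2, 3, 3, 4, 5, 6 shown above.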

Now I'm just typing things at random, flailing wildly.

SELECT 
   Date, TheThing, 
   RANK() OVER(PARTITION BY(CASE WHEN TheThing IS NOT NULL THEN 1 ELSE NULL END) ORDER BY Date) AS Total
FROM MyData
ORDER BY Date

SELECT 
   Date, TheThing, 
   DENSE_RANK() OVER(PARTITION BY(CASE WHEN TheThing IS NOT NULL THEN 1 ELSE NULL END) ORDER BY Date) AS Total
FROM MyData
ORDER BY Date

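Edit: With all the answers, it took several iterations to find all the edge cases of what I wanted. In the end, what I conceptually wanted was OVER() applied to a count. I had no idea OVER could be applied to anything other than RANK (and DENSE_RANK).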

http://sqlfiddle.com/#!18/c6d87/1
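As the edit notes, OVER is not limited to ranking functions: in SQL Server 2012 and later, ordinary aggregates accept `OVER (ORDER BY ...)` and become running aggregates. A small sketch on the same sample rows, inlined through a CTE so it runs without the real table; the column aliases are made up for illustration:

```sql
-- Inline copy of the question's sample rows (illustrative; replaces the real MyData table).
WITH MyData (Date, TheThing) AS (
    SELECT CAST(d AS datetime), t
    FROM (VALUES
        ('2016-03-09T08:17:29.867', 'a'),
        ('2016-03-09T08:18:33.327', 'a'),
        ('2016-03-09T14:32:01.240', NULL),
        ('2016-10-21T19:53:49.983', NULL),
        ('2016-11-12T03:25:21.753', 'b'),
        ('2016-11-24T07:43:24.483', NULL),
        ('2016-11-28T16:06:23.090', 'b'),
        ('2016-11-28T16:09:07.200', 'c'),
        ('2016-12-10T11:21:55.807', 'c')
    ) AS v (d, t)
)
SELECT Date, TheThing,
       -- Running count of all rows, NULL or not.
       COUNT(*) OVER (ORDER BY Date) AS RowsSoFar,
       -- Running sum of a 1/0 flag: only non-NULL rows add to the total.
       SUM(CASE WHEN TheThing IS NOT NULL THEN 1 ELSE 0 END)
           OVER (ORDER BY Date) AS NonNullSoFar,
       -- Running maximum; aggregates skip NULL inputs.
       MAX(TheThing) OVER (ORDER BY Date) AS LargestSoFar
FROM MyData
ORDER BY Date;
```

For these rows `RowsSoFar` runs 1 through 9 (the same as RANK() here, since no two dates are equal), while `NonNullSoFar` gives 1, 2, 2, 2, 3, 3, 4, 5, 6, matching the DesiredTotal column.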

Bonus Reading

Gor*_*off 5

I think you are looking for a cumulative count:

-- COUNT(column) ignores NULLs, so the running total only advances on non-NULL rows.
SELECT Date, TheThing, 
       COUNT(TheThing) OVER (ORDER BY Date) AS Total
FROM MyData
ORDER BY Date;
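To check the cumulative count against the question's sample without creating a table, the rows can be inlined. This is only a sketch: the CTE and the `TotalByRow` column are additions for illustration, not part of the answer. With only ORDER BY in the window, the default frame is RANGE UNBOUNDED PRECEDING AND CURRENT ROW, so rows sharing the same Date would all receive the same total; an explicit ROWS frame counts strictly row by row instead. The sample has no duplicate dates, so both columns agree here.

```sql
-- Inline copy of the question's sample rows (illustrative; replaces the real MyData table).
WITH MyData (Date, TheThing) AS (
    SELECT CAST(d AS datetime), t
    FROM (VALUES
        ('2016-03-09T08:17:29.867', 'a'),
        ('2016-03-09T08:18:33.327', 'a'),
        ('2016-03-09T14:32:01.240', NULL),
        ('2016-10-21T19:53:49.983', NULL),
        ('2016-11-12T03:25:21.753', 'b'),
        ('2016-11-24T07:43:24.483', NULL),
        ('2016-11-28T16:06:23.090', 'b'),
        ('2016-11-28T16:09:07.200', 'c'),
        ('2016-12-10T11:21:55.807', 'c')
    ) AS v (d, t)
)
SELECT Date, TheThing,
       -- Default frame (RANGE): rows with equal Date are peers and share a total.
       COUNT(TheThing) OVER (ORDER BY Date) AS Total,
       -- Explicit ROWS frame: each row sees only the rows physically before it.
       COUNT(TheThing) OVER (ORDER BY Date ROWS UNBOUNDED PRECEDING) AS TotalByRow
FROM MyData
ORDER BY Date;
-- Both columns: 1, 2, 2, 2, 3, 3, 4, 5, 6
```

Which frame is appropriate depends on how ties in Date should be treated; the question's data does not settle that either way.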