How to number repeated rows while taking the other rows in between into account

0 sql-server sql-server-2008-r2

On SQL Server 2008 R2, coming up with a single query for the following seems challenging. Given this sample with columns a and b:

a |   b
-------------
0 |  2000
1 |  2001
1 |  2002
1 |  2003
2 |  2004
3 |  2005
1 |  2006
1 |  2007
4 |  2008
1 |  2009

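Goal: mark the rows that repeat the value in column a and give each such group its own number, taking the other values in between into account. The result should go in column c. Note that the hardest part here is filling column c with 2, 5, and 7.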

a |  b   |  c
-------------
0 |  2000 | 1
1 |  2001 | 2
1 |  2002 | 2
1 |  2003 | 2
2 |  2004 | 3
3 |  2005 | 4
1 |  2006 | 5
1 |  2007 | 5
4 |  2008 | 6
1 |  2009 | 7

ype*_*eᵀᴹ 5

This is a gaps-and-islands problem. One of the many ways to solve it (this one requires SQL Server 2012 or later):

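WITH 
  t AS
    ( -- x = 1 on the first row of each run of equal a values, NULL otherwise
      SELECT a, b, x = CASE WHEN a = LAG(a) OVER (ORDER BY b) 
                           THEN NULL ELSE 1 
                       END
      FROM table_name
    )
-- COUNT(x) skips NULLs, so the running count only grows where a run starts
SELECT a, b, c = COUNT(x) OVER (ORDER BY b) 
FROM t 
ORDER BY b ;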
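
To see why the LAG query produces 2, 5, and 7, it may help to trace the same logic outside the database. The following is only a rough Python sketch, not T-SQL: the function name `number_runs` is made up for illustration, and it assumes column a contains no NULLs and that b is unique, as in the sample data.

```python
# Sample rows (a, b) copied from the question.
rows = [(0, 2000), (1, 2001), (1, 2002), (1, 2003), (2, 2004),
        (3, 2005), (1, 2006), (1, 2007), (4, 2008), (1, 2009)]


def number_runs(rows):
    """Hypothetical helper mimicking LAG(a) plus a running COUNT(x).

    A row starts a new run when it is the first row or when its a value
    differs from the previous row's a (the role LAG plays in the query).
    c is the number of run starts seen so far, like COUNT(x) OVER (ORDER BY b).
    """
    ordered = sorted(rows, key=lambda r: r[1])  # ORDER BY b
    result = []
    c = 0
    for i, (a, b) in enumerate(ordered):
        starts_run = i == 0 or a != ordered[i - 1][0]
        if starts_run:
            c += 1
        result.append((a, b, c))
    return result


numbered = number_runs(rows)
print([c for _, _, c in numbered])  # [1, 2, 2, 2, 3, 4, 5, 5, 6, 7]
```

The three separate runs of a = 1 get 2, 5, and 7 because the counter advances on every change of a, not on every distinct value of a.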

This should work on SQL Server 2005 and later:

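WITH 
  t AS
    ( -- dx stays constant within each run of consecutive equal a values
      SELECT a, b, dx = ROW_NUMBER() OVER (ORDER BY b) 
                        - ROW_NUMBER() OVER (PARTITION BY a ORDER BY b) 
      FROM table_name
    ),
  tt AS
    ( -- mb = smallest b in the run; it identifies the run and orders the runs
      SELECT a, b, mb = MIN(b) OVER (PARTITION BY a, dx)
      FROM t 
    )
-- number the runs by their starting b
SELECT a, b, c = DENSE_RANK() OVER (ORDER BY mb)
FROM tt 
ORDER BY b ;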
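
The row-number difference is the less obvious of the two approaches, so here is a hedged Python trace of the intermediate values dx and mb for the sample data. Again this is a sketch rather than T-SQL: `trace_row_number_difference` is a hypothetical name, and it assumes b is unique so that both ROW_NUMBER orderings are deterministic.

```python
from collections import defaultdict

# Sample rows (a, b) copied from the question.
rows = [(0, 2000), (1, 2001), (1, 2002), (1, 2003), (2, 2004),
        (3, 2005), (1, 2006), (1, 2007), (4, 2008), (1, 2009)]


def trace_row_number_difference(rows):
    """Hypothetical helper returning (a, b, dx, mb, c) for each row."""
    ordered = sorted(rows, key=lambda r: r[1])  # ORDER BY b

    # dx = ROW_NUMBER() OVER (ORDER BY b)
    #      - ROW_NUMBER() OVER (PARTITION BY a ORDER BY b)
    seen_per_a = defaultdict(int)
    keyed = []
    for overall_rn, (a, b) in enumerate(ordered, start=1):
        seen_per_a[a] += 1
        keyed.append((a, b, overall_rn - seen_per_a[a]))

    # mb = MIN(b) OVER (PARTITION BY a, dx)
    min_b = {}
    for a, b, dx in keyed:
        min_b[(a, dx)] = min(b, min_b.get((a, dx), b))

    # c = DENSE_RANK() OVER (ORDER BY mb)
    rank_of = {mb: rank
               for rank, mb in enumerate(sorted(set(min_b.values())), start=1)}

    return [(a, b, dx, min_b[(a, dx)], rank_of[min_b[(a, dx)]])
            for a, b, dx in keyed]


for row in trace_row_number_difference(rows):
    print(row)
```

In this trace the rows (2, 2004) and (1, 2009) both end up with dx = 4, which is why the second CTE partitions by a and dx together rather than by dx alone.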