T-SQL: counting consecutive records

Roe*_*and 2 t-sql sql-server grouping partition

Suppose I have the following records:

KeyCol     ColA     ColB
------------------------
1          1        A
2          2        B
3          2        B
4          2        C
5          2        B
6          1        A
7          2        B
8          2        B

I want to count the consecutive records that have the same values in both ColA and ColB, producing this result:

ColA       ColB     Start   Count
---------------------------------
1          A        1       1
2          B        2       2
2          C        4       1
2          B        5       1
1          A        6       1
2          B        7       2

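There are many similar questions about grouping and counting, but I can't see how to translate them to this case. In particular, many of the other examples do not have an explicit key column.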

I tried using ROW_NUMBER() with PARTITION BY to number the consecutive records and work from there:

SELECT KeyCol, ColA, ColB
      ,ROW_NUMBER() OVER 
            (   PARTITION
                BY ColA, ColB
                ORDER BY KeyCol
            ) as RowNo
FROM MyTable

However, this produces the following result:

KeyCol    ColA       ColB     RowNo
---------------------------------
1         1          A        1
2         2          B        1
3         2          B        2
4         2          C        1
5         2          B        3   (Needs to be 1)
6         1          A        2   (Needs to be 1)
7         2          B        4   (Needs to be 1)
8         2          B        5   (Needs to be 2)

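As you can see, the row number keeps increasing across all rows with the same ColA, ColB pair, even when those records are not consecutive.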

Thank you very much!

Gar*_*thD 5

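This is a gaps-and-islands problem. You need to use ranking functions to identify the groups (islands) of consecutive rows that share the same ColA and ColB values. Take the following query: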

SELECT  KeyCol,
        ColA,
        ColB,
        GroupBy = ROW_NUMBER() OVER(ORDER BY KeyCol) - 
                    ROW_NUMBER() OVER(PARTITION BY ColA, ColB ORDER BY KeyCol)
FROM    dbo.T
ORDER BY KeyCol;

It gives you this output:

KeyCol     ColA     ColB    GroupBy
-----------------------------------------
1          1        A           0
2          2        B           1
3          2        B           1
4          2        C           3
5          2        B           2
6          1        A           4
7          2        B           3
8          2        B           3   

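As you can see, this identifies your islands: wherever two (or more) consecutive rows have the same ColA and ColB values, they get the same value in the GroupBy column.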
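To see why the subtraction works, it may help to look at the two row numbers side by side. The sketch below reuses the question's sample data in a table variable; the aliases OverallRn and PairRn are illustrative names that do not appear in the original answer.

```sql
-- Sample data copied from the question.
DECLARE @T TABLE (KeyCol INT, ColA INT, ColB CHAR(1));
INSERT @T (KeyCol, ColA, ColB)
VALUES
    (1, 1, 'A'), (2, 2, 'B'), (3, 2, 'B'), (4, 2, 'C'),
    (5, 2, 'B'), (6, 1, 'A'), (7, 2, 'B'), (8, 2, 'B');

SELECT  KeyCol,
        ColA,
        ColB,
        -- Position of the row within the whole table (hypothetical alias).
        OverallRn = ROW_NUMBER() OVER(ORDER BY KeyCol),
        -- Position of the row among rows with the same (ColA, ColB) pair (hypothetical alias).
        PairRn    = ROW_NUMBER() OVER(PARTITION BY ColA, ColB ORDER BY KeyCol),
        -- The difference stays constant for as long as the pair repeats without interruption.
        GroupBy   = ROW_NUMBER() OVER(ORDER BY KeyCol) -
                    ROW_NUMBER() OVER(PARTITION BY ColA, ColB ORDER BY KeyCol)
FROM    @T
ORDER BY KeyCol;
```

Within an island both counters advance by one per row, so their difference does not change (rows 2 and 3: 2 - 1 and 3 - 2). When a different pair interrupts the run, OverallRn moves ahead while PairRn does not, so the next island of the same pair gets a larger difference (row 5: 5 - 3 = 2).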

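Once you have that, a simple GROUP BY produces the output you need. Note that a GroupBy value is only unique within one ColA, ColB pair (3 appears above for both 2/C and 2/B), so the grouping has to include all three columns. The final query (with sample data included) is: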

DECLARE @T TABLE (KeyCol INT, ColA INT, ColB CHAR(1));
INSERT @T (KeyCol, ColA, ColB)
VALUES
    (1, 1, 'A'), (2, 2, 'B'), (3, 2, 'B'), (4, 2, 'C'),
    (5, 2, 'B'), (6, 1, 'A'), (7, 2, 'B'), (8, 2, 'B');

WITH RankedData AS
(   SELECT  KeyCol,
            ColA,
            ColB,
            GroupBy = ROW_NUMBER() OVER(ORDER BY KeyCol) - 
                        ROW_NUMBER() OVER(PARTITION BY ColA, ColB ORDER BY KeyCol)
    FROM    @T
)
SELECT  ColA, 
        ColB,
        Start = MIN(KeyCol),
        [Count] = COUNT(*)
FROM    RankedData
GROUP BY ColA, ColB, GroupBy
ORDER BY Start;
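For comparison, here is a sketch of a different technique that is not part of the answer above: flag the first row of each island with LAG, then turn the flags into island numbers with a running SUM. It assumes SQL Server 2012 or later (LAG and windowed SUM with ORDER BY are not available before that), and the names Flagged, Numbered, IsStart and IslandNo are made up for illustration.

```sql
-- Sample data copied from the question.
DECLARE @T TABLE (KeyCol INT, ColA INT, ColB CHAR(1));
INSERT @T (KeyCol, ColA, ColB)
VALUES
    (1, 1, 'A'), (2, 2, 'B'), (3, 2, 'B'), (4, 2, 'C'),
    (5, 2, 'B'), (6, 1, 'A'), (7, 2, 'B'), (8, 2, 'B');

WITH Flagged AS
(   SELECT  KeyCol,
            ColA,
            ColB,
            -- 1 when the row starts a new island: there is no previous row,
            -- or the previous row (by KeyCol) has a different (ColA, ColB) pair.
            -- Assumption: NULLs never compare equal here, so a NULL in either
            -- column always starts a new island.
            IsStart = CASE WHEN LAG(ColA) OVER(ORDER BY KeyCol) = ColA
                            AND LAG(ColB) OVER(ORDER BY KeyCol) = ColB
                           THEN 0 ELSE 1 END
    FROM    @T
),
Numbered AS
(   SELECT  KeyCol,
            ColA,
            ColB,
            -- Running total of the start flags gives each island its own number.
            IslandNo = SUM(IsStart) OVER(ORDER BY KeyCol ROWS UNBOUNDED PRECEDING)
    FROM    Flagged
)
SELECT  ColA,
        ColB,
        Start = MIN(KeyCol),
        [Count] = COUNT(*)
FROM    Numbered
GROUP BY ColA, ColB, IslandNo
ORDER BY Start;
```

On the sample data this should return the same six rows as the desired result in the question. Unlike the row-number difference, IslandNo is unique per island on its own; ColA and ColB stay in the GROUP BY only so they can be selected.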