SQL Server: row_number partitioned by timeout

usr*_*usr 5 t-sql sql-server olap row-number sql-server-2012

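I have a table containing a series of (IP varchar(15), DateTime datetime2) values. Each row corresponds to an HTTP request made by a user. I want to assign session numbers to these rows. Different IP addresses get different session numbers. The same IP should get a new session number if its last request was more than 30 minutes ago. Here is a sample output: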

IP,      DateTime,         SessionNumber, RequestNumber
1.1.1.1, 2012-01-01 00:01, 1,             1
1.1.1.1, 2012-01-01 00:02, 1,             2
1.1.1.1, 2012-01-01 00:03, 1,             3
1.1.1.2, 2012-01-01 00:04, 2,             1 --different IP => new session number
1.1.1.2, 2012-01-01 00:05, 2,             2
1.1.1.2, 2012-01-01 00:40, 3,             1 --same IP, but last request 35min ago (> 30min)

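Columns 1 and 2 are the input; columns 3 and 4 are the desired output. The table shows two users.

Since the underlying table is really large, how can this be solved efficiently? I would prefer a small number of passes over the data (one or two).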
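To try the queries against the sample rows, a minimal setup could look like the sketch below. The table name `Sessions` is the one the answer's queries read from and the column types come from the question; the `NOT NULL` constraints and the index (including its name `IX_Sessions_IP_DateTime`) are assumptions added for illustration.

```sql
-- Sample table matching the question's schema (IP varchar(15), DateTime datetime2).
-- NOT NULL is an assumption; the question does not say whether NULLs can occur.
CREATE TABLE Sessions
(
    IP         varchar(15) NOT NULL,
    [DateTime] datetime2   NOT NULL
);

-- The six sample requests from the question.
INSERT INTO Sessions (IP, [DateTime])
VALUES ('1.1.1.1', '2012-01-01 00:01'),
       ('1.1.1.1', '2012-01-01 00:02'),
       ('1.1.1.1', '2012-01-01 00:03'),
       ('1.1.1.2', '2012-01-01 00:04'),
       ('1.1.1.2', '2012-01-01 00:05'),
       ('1.1.1.2', '2012-01-01 00:40');

-- Optional and hypothetical: an index ordered by (IP, DateTime) may let the
-- optimizer read rows already in the order that per-IP window functions need.
CREATE CLUSTERED INDEX IX_Sessions_IP_DateTime
    ON Sessions (IP, [DateTime]);
```

Whether such an index is affordable on a very large table depends on the workload, so treat it as something to measure rather than a given.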

Mar*_*ith 8

Here are a couple of attempts.

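;WITH CTE1 AS
(
-- Flag a row as a session start (1) when the same IP has no earlier request
-- (LAG returns NULL) or DATEDIFF reports 30 or more minutes since the previous one.
SELECT *,
       IIF(DATEDIFF(MINUTE,
              LAG(DateTime) OVER (PARTITION BY IP ORDER BY DateTime),
              DateTime) < 30,0,1) AS SessionFlag
FROM Sessions
), CTE2 AS
(
-- A running total of the flags numbers the sessions within each IP.
SELECT *,
       SUM(SessionFlag) OVER (PARTITION BY IP 
                                  ORDER BY DateTime) AS IPSessionNumber
FROM CTE1
)
-- DENSE_RANK turns (IP, IPSessionNumber) pairs into one global session number.
SELECT IP,
       DateTime,
       DENSE_RANK() OVER (ORDER BY IP, IPSessionNumber) AS SessionNumber,
       ROW_NUMBER() OVER (PARTITION BY IP, IPSessionNumber 
                              ORDER BY DateTime) AS RequestNumber
FROM CTE2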

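This needs two sort operations (first on IP, DateTime, then on IP, IPSessionNumber), but it assumes that SessionNumber values can be assigned arbitrarily, as long as each new session under the IP address / 30-minute rule gets a distinct session number.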
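The intermediate columns of the first query can be inspected with a self-contained sketch like the one below. It inlines the question's six sample rows in a CTE named `Sample` (a hypothetical name, used so that no existing table is required) and stops before the final `DENSE_RANK` step.

```sql
;WITH Sample AS
(
    -- The six sample requests from the question, inlined for illustration.
    SELECT v.IP, CAST(v.dt AS datetime2) AS [DateTime]
    FROM (VALUES ('1.1.1.1', '2012-01-01 00:01'),
                 ('1.1.1.1', '2012-01-01 00:02'),
                 ('1.1.1.1', '2012-01-01 00:03'),
                 ('1.1.1.2', '2012-01-01 00:04'),
                 ('1.1.1.2', '2012-01-01 00:05'),
                 ('1.1.1.2', '2012-01-01 00:40')) AS v (IP, dt)
), CTE1 AS
(
    -- Same flag expression as in the answer: 1 marks the start of a session.
    SELECT *,
           IIF(DATEDIFF(MINUTE,
                  LAG([DateTime]) OVER (PARTITION BY IP ORDER BY [DateTime]),
                  [DateTime]) < 30, 0, 1) AS SessionFlag
    FROM Sample
)
SELECT IP,
       [DateTime],
       SessionFlag,
       SUM(SessionFlag) OVER (PARTITION BY IP
                                  ORDER BY [DateTime]) AS IPSessionNumber
FROM CTE1
ORDER BY IP, [DateTime];
-- SessionFlag, IPSessionNumber per row:
-- 1.1.1.1 00:01 -> 1, 1
-- 1.1.1.1 00:02 -> 0, 1
-- 1.1.1.1 00:03 -> 0, 1
-- 1.1.1.2 00:04 -> 1, 1
-- 1.1.1.2 00:05 -> 0, 1
-- 1.1.1.2 00:40 -> 1, 2
```

The pairs (1.1.1.1, 1), (1.1.1.2, 1) and (1.1.1.2, 2) are what `DENSE_RANK` then maps to session numbers 1, 2 and 3.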

To assign SessionNumber values sequentially in chronological order instead, I used the following.

;WITH CTE1 AS
(
-- Flag a row as a session start (1), exactly as in the first query.
SELECT *,
       IIF(DATEDIFF(MINUTE,
              LAG(DateTime) OVER (PARTITION BY IP ORDER BY DateTime),
              DateTime) < 30,0,1) AS SessionFlag
FROM Sessions
), CTE2 AS(
-- Running total over all rows in time order: a chronological number per session start.
SELECT *,
       SUM(SessionFlag) OVER (ORDER BY DateTime) AS GlobalSessionNo
FROM CTE1
), CTE3 AS(
-- Within each IP, carry the number of the most recent session start forward.
SELECT *,
       MAX(CASE WHEN SessionFlag = 1 THEN GlobalSessionNo END) 
               OVER (PARTITION BY IP ORDER BY DateTime) AS SessionNumber
FROM CTE2)
SELECT IP,
       DateTime,
       SessionNumber,
       ROW_NUMBER() OVER (PARTITION BY SessionNumber 
                              ORDER BY DateTime) AS RequestNumber
FROM CTE3

However, this increases the number of sort operations to 4.
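The reason the second query needs the extra `MAX(CASE ...)` step shows up once requests from different IPs interleave in time. The sketch below uses three made-up rows in a CTE named `Interleaved` (both the rows and the name are hypothetical, not data from the question).

```sql
;WITH Interleaved AS
(
    -- Hypothetical rows: a request from 1.1.1.2 falls between two from 1.1.1.1.
    SELECT v.IP, CAST(v.dt AS datetime2) AS [DateTime]
    FROM (VALUES ('1.1.1.1', '2012-01-01 00:01'),
                 ('1.1.1.2', '2012-01-01 00:02'),
                 ('1.1.1.1', '2012-01-01 00:03')) AS v (IP, dt)
), CTE1 AS
(
    SELECT *,
           IIF(DATEDIFF(MINUTE,
                  LAG([DateTime]) OVER (PARTITION BY IP ORDER BY [DateTime]),
                  [DateTime]) < 30, 0, 1) AS SessionFlag
    FROM Interleaved
), CTE2 AS
(
    SELECT *,
           SUM(SessionFlag) OVER (ORDER BY [DateTime]) AS GlobalSessionNo
    FROM CTE1
)
SELECT IP,
       [DateTime],
       SessionFlag,
       GlobalSessionNo,
       MAX(CASE WHEN SessionFlag = 1 THEN GlobalSessionNo END)
               OVER (PARTITION BY IP ORDER BY [DateTime]) AS SessionNumber
FROM CTE2
ORDER BY [DateTime];
-- SessionFlag, GlobalSessionNo, SessionNumber per row:
-- 1.1.1.1 00:01 -> 1, 1, 1
-- 1.1.1.2 00:02 -> 1, 2, 2
-- 1.1.1.1 00:03 -> 0, 2, 1
```

The last row belongs to the first session of 1.1.1.1, yet its `GlobalSessionNo` has already been bumped to 2 by the other IP's session start. Taking the maximum over session-start rows only, within the same IP, restores the correct value of 1.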

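  • @usr - It might be worth evaluating a cursor and a `#temp` table! Not sure whether this is a case where [OVER clause enhancement request - progressive ordered calculations](http://connect.microsoft.com/SQLServer/feedback/details/254397/over-clause-enhancement-request-progressive-ordered-calculations) would help. (2 upvotes)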
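A rough sketch of what the cursor plus `#temp` table idea from the comment might look like follows. It is not from the original thread: the names `#Result`, `SessionCursor` and all variables are hypothetical, and it numbers sessions in (IP, DateTime) order like the first query, using the same 30-minute `DATEDIFF` test as the answer.

```sql
-- Hypothetical result table; the Sessions table is the one from the question.
CREATE TABLE #Result
(
    IP            varchar(15) NOT NULL,
    [DateTime]    datetime2   NOT NULL,
    SessionNumber int         NOT NULL,
    RequestNumber int         NOT NULL
);

DECLARE @IP varchar(15), @DateTime datetime2,
        @PrevIP varchar(15) = NULL, @PrevDateTime datetime2 = NULL,
        @SessionNumber int = 0, @RequestNumber int = 0;

-- One ordered pass over the data.
DECLARE SessionCursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT IP, [DateTime]
    FROM Sessions
    ORDER BY IP, [DateTime];

OPEN SessionCursor;
FETCH NEXT FROM SessionCursor INTO @IP, @DateTime;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- New session on the first row, on an IP change, or when DATEDIFF
    -- reports 30 or more minutes (mirrors the IIF test in the answer).
    IF @PrevIP IS NULL
       OR @IP <> @PrevIP
       OR DATEDIFF(MINUTE, @PrevDateTime, @DateTime) >= 30
    BEGIN
        SET @SessionNumber += 1;
        SET @RequestNumber = 1;
    END
    ELSE
        SET @RequestNumber += 1;

    INSERT INTO #Result (IP, [DateTime], SessionNumber, RequestNumber)
    VALUES (@IP, @DateTime, @SessionNumber, @RequestNumber);

    -- Remember this row for the comparison on the next iteration.
    SELECT @PrevIP = @IP, @PrevDateTime = @DateTime;

    FETCH NEXT FROM SessionCursor INTO @IP, @DateTime;
END

CLOSE SessionCursor;
DEALLOCATE SessionCursor;

SELECT IP, [DateTime], SessionNumber, RequestNumber
FROM #Result
ORDER BY SessionNumber, RequestNumber;
```

This reads the rows once in a single order, but it inserts row by row, so whether it beats the window-function queries on a large table would have to be measured.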