mat*_*hew 9 sql sql-server sql-server-2012
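I have a large dataset that, for the purposes of this question, has 3 fields: Group ID, From Date, and To Date.
On any given row, From Date is always less than To Date, but within each group the time periods represented by the date pairs (which come in no particular order) can overlap, be contained one within another, or even be identical.
What I ultimately want is a query that condenses the results for each group into continuous time periods. For example, a group that looks like this: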
| Group ID | From Date | To Date |
--------------------------------------
| A | 01/01/2012 | 12/31/2012 |
| A | 12/01/2013 | 11/30/2014 |
| A | 01/01/2015 | 12/31/2015 |
| A | 01/01/2015 | 12/31/2015 |
| A | 02/01/2015 | 03/31/2015 |
| A | 01/01/2013 | 12/31/2013 |
would result in this:
| Group ID | From Date | To Date |
--------------------------------------
| A | 01/01/2012 | 11/30/2014 |
| A | 01/01/2015 | 12/31/2015 |
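I have read a number of articles on date packing, but I can't figure out how to apply it to my dataset.
How can I construct a query that gives me these results?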
小智 5
A solution from the book Microsoft® SQL Server® 2012 High-Performance T-SQL Using Window Functions:
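-- C1: turn every period into two events: +1 at its start and -1 on the
-- day after its end, so periods that merely abut count as continuous.
;with C1 as(
    select GroupID, FromDate as ts, +1 as type, 1 as sub
    from dbo.table_name
    union all
    select GroupID, dateadd(day, +1, ToDate) as ts, -1 as type, 0 as sub
    from dbo.table_name),
-- C2: running count of open periods. For a start event, subtracting sub
-- gives the count just before it; for an end event, the count just after it.
C2 as(
    select C1.*
         , sum(type) over(partition by GroupID order by ts, type desc
                          rows between unbounded preceding and current row) - sub as cnt
    from C1),
-- C3: cnt = 0 keeps only the events that open or close a packed period.
-- They alternate start, end, so each consecutive pair shares one grpnum.
C3 as(
    select GroupID, ts
         , floor((row_number() over(partition by GroupID order by ts) - 1) / 2 + 1) as grpnum
    from C2
    where cnt = 0)
-- min(ts) is the packed start; max(ts) less the added day is the packed end.
select GroupID, min(ts) as FromDate, dateadd(day, -1, max(ts)) as ToDate
from C3
group by GroupID, grpnum;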
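To see why `cnt = 0` picks out the boundaries, it can help to look at the intermediate rows that C2 produces. The following is only a sketch: `SampleData` is a hypothetical inline CTE holding the six rows from the question (typed as `date`) so that it runs without `dbo.table_name`; the rest mirrors C1 and C2 from the query above.

```sql
-- SampleData is a hypothetical stand-in for dbo.table_name (assumption:
-- the six rows from the question, written as unambiguous YYYYMMDD literals).
with SampleData as (
    select GroupID,
           cast(FromDate as date) as FromDate,
           cast(ToDate   as date) as ToDate
    from (values
        ('A', '20120101', '20121231'),
        ('A', '20131201', '20141130'),
        ('A', '20150101', '20151231'),
        ('A', '20150101', '20151231'),
        ('A', '20150201', '20150331'),
        ('A', '20130101', '20131231')
    ) as v(GroupID, FromDate, ToDate)
),
-- Same event list as C1: +1 at each start, -1 on the day after each end.
C1 as (
    select GroupID, FromDate as ts, +1 as type, 1 as sub
    from SampleData
    union all
    select GroupID, dateadd(day, 1, ToDate), -1, 0
    from SampleData
)
-- Same running count as C2, returned row by row for inspection.
select GroupID, ts, type,
       sum(type) over(partition by GroupID order by ts, type desc
                      rows between unbounded preceding and current row) - sub as cnt
from C1
order by GroupID, ts, type desc, cnt;
```

For the sample group this returns 12 event rows, and the four with `cnt = 0` fall on 2012-01-01, 2014-12-01, 2015-01-01 and 2016-01-01: two start/end pairs, with each end sitting one day past the real end date, which is why the final select subtracts a day again. The one-day shift assumes the dates carry no time-of-day component.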
Creating the table:
-- Read the literals below as month/day/year whatever the login's language is.
set dateformat mdy;
if object_id('dbo.table_name') is not null
    drop table dbo.table_name;
create table dbo.table_name(GroupID varchar(100), FromDate datetime, ToDate datetime);
insert into dbo.table_name (GroupID, FromDate, ToDate)
select 'A', '01/01/2012', '12/31/2012' union all
select 'A', '12/01/2013', '11/30/2014' union all
select 'A', '01/01/2015', '12/31/2015' union all
select 'A', '01/01/2015', '12/31/2015' union all
select 'A', '02/01/2015', '03/31/2015' union all
select 'A', '01/01/2013', '12/31/2013';
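As a cross-check on the sample rows, the same packing can be sketched with a different technique: a running maximum of the end date, in the usual gaps-and-islands style. This is not the book's method, just an independent formulation to compare results against. `SampleData`, `Flagged`, `Numbered` and `is_start` are hypothetical names, and the sketch assumes whole-day dates, so a period starting the day after the latest end seen so far continues the current island.

```sql
-- SampleData is a hypothetical stand-in for dbo.table_name.
with SampleData as (
    select GroupID,
           cast(FromDate as date) as FromDate,
           cast(ToDate   as date) as ToDate
    from (values
        ('A', '20120101', '20121231'),
        ('A', '20131201', '20141130'),
        ('A', '20150101', '20151231'),
        ('A', '20150101', '20151231'),
        ('A', '20150201', '20150331'),
        ('A', '20130101', '20131231')
    ) as v(GroupID, FromDate, ToDate)
),
-- A row opens a new island when it starts more than one day after the
-- latest ToDate among all earlier rows (NULL for the first row).
Flagged as (
    select GroupID, FromDate, ToDate,
           case when FromDate <= dateadd(day, 1,
                         max(ToDate) over(partition by GroupID
                                          order by FromDate, ToDate
                                          rows between unbounded preceding
                                                   and 1 preceding))
                then 0 else 1 end as is_start
    from SampleData
),
-- Running sum of the flags numbers the islands. is_start desc breaks ties
-- between identical rows so the flagged copy is always counted first.
Numbered as (
    select GroupID, FromDate, ToDate,
           sum(is_start) over(partition by GroupID
                              order by FromDate, ToDate, is_start desc
                              rows between unbounded preceding
                                       and current row) as grpnum
    from Flagged
)
select GroupID, min(FromDate) as FromDate, max(ToDate) as ToDate
from Numbered
group by GroupID, grpnum
order by GroupID, min(FromDate);
```

On the sample data this yields the same two rows as the expected result in the question: 2012-01-01 to 2014-11-30 and 2015-01-01 to 2015-12-31. Both window frames used here (`rows between unbounded preceding and 1 preceding` and the running sum) are available from SQL Server 2012 onward.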