Vij*_*jay 5 t-sql sql-server-2008 gaps-and-islands
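I've been working on this without getting anywhere, and the deadline is fast approaching. Also, there are more than a million rows in the format shown below. I'd appreciate your help with the following.
Objective: group the results by member and build a continuous coverage range for each member by combining the individual date ranges that overlap or run consecutively, with no break between the start and end day of the range.
I have data in the following format: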
MemberCode ----- ClaimID ----- StartDate ----- EndDate
00001 ----- 012345 ----- 2010-01-15 ----- 2010-01-20
00001 ----- 012350 ----- 2010-01-19 ----- 2010-01-22
00001 ----- 012352 ----- 2010-01-20 ----- 2010-01-25
00001 ----- 012355 ----- 2010-01-26 ----- 2010-01-30
00002 ----- 012357 ----- 2010-01-20 ----- 2010-01-25
00002 ----- 012359 ----- 2010-01-30 ----- 2010-02-05
00002 ----- 012360 ----- 2010-02-04 ----- 2010-02-15
00003 ----- 012365 ----- 2010-02-15 ----- 2010-02-30
...
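In the above, member 00001 is a valid member because there is a continuous date range from 2010-01-15 to 2010-01-30 (no gaps). Note that ClaimID 012355 for this member starts on the day right after the end date of ClaimID 012352. This is still valid because it forms a continuous range.
However, member 00002 should be an invalid member because there is a 5-day gap between the end date of ClaimID 012357 and the start date of ClaimID 012359.
What I'm trying to do is get a list of only those members who have a claim covering every single day of a continuous date range, with no gaps between MIN(StartDate) and MAX(EndDate) for each distinct member. Members who have gaps are discarded.
Thanks in advance.
UPDATE:
I've gotten as far as this. Note: FILLED_DT = Start Date & PresCoverEndDT = End Date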
SELECT PresCoverEndDT, FILLED_DT
FROM
(
SELECT DISTINCT FILLED_DT, ROW_NUMBER() OVER (ORDER BY FILLED_DT) RN
FROM Temp_Claims_PRIOR_STEP_5 T1
WHERE NOT EXISTS
(SELECT * FROM Temp_Claims_PRIOR_STEP_5 T2
WHERE T1.FILLED_DT > T2.FILLED_DT AND T1.FILLED_DT< T2.PresCoverEndDT
AND T1.MBR_KEY = T2.MBR_KEY )
) T1
JOIN (SELECT DISTINCT PresCoverEndDT, ROW_NUMBER() OVER (ORDER BY PresCoverEndDT) RN
FROM Temp_Claims_PRIOR_STEP_5 T1
WHERE NOT EXISTS
(SELECT * FROM Temp_Claims_PRIOR_STEP_5 T2
WHERE T1.PresCoverEndDT > T2.FILLED_DT AND T1.PresCoverEndDT < T2.PresCoverEndDT AND T1.MBR_KEY = T2.MBR_KEY )
) T2
ON T1.RN - 1 = T2.RN
WHERE PresCoverEndDT < FILLED_DT
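The code above seems to have a bug, because I get only one row, and that row is incorrect too. My desired output is just one column, as follows: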
Valid_Member_Code
00001
00007
00009
......and so on.
Try this: http://www.sqlfiddle.com/#!3/c3365/20
with s as
(
select *, row_number() over(partition by membercode order by startdate) rn
from tbl
)
,gaps as
(
select a.membercode, a.startdate, a.enddate, b.startdate as nextstartdate
,datediff(d, a.enddate, b.startdate) as gap
from s a
join s b on b.membercode = a.membercode and b.rn = a.rn + 1
)
select membercode
from gaps
group by membercode
having sum(case when gap <= 1 then 1 end) = count(*);
See the query progression here: http://www.sqlfiddle.com/#!3/c3365/20
How it works: compare each claim's end date with the next claim's start date and check the gap in days:
with s as
(
select *, row_number() over(partition by membercode order by startdate) rn
from tbl
)
select a.membercode, a.startdate, a.enddate, b.startdate as nextstartdate
,datediff(d, a.enddate, b.startdate) as gap
from s a
join s b on b.membercode = a.membercode and b.rn = a.rn + 1;
Output:
| MEMBERCODE | STARTDATE | ENDDATE | NEXTSTARTDATE | GAP |
--------------------------------------------------------------
| 1 | 2010-01-15 | 2010-01-20 | 2010-01-19 | -1 |
| 1 | 2010-01-19 | 2010-01-22 | 2010-01-20 | -2 |
| 1 | 2010-01-20 | 2010-01-25 | 2010-01-26 | 1 |
| 2 | 2010-01-20 | 2010-01-25 | 2010-01-30 | 5 |
| 2 | 2010-01-30 | 2010-02-05 | 2010-02-04 | -1 |
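To reproduce the output above outside the fiddle, a table along the following lines should do. This is only a sketch: the column types are assumptions, since the question doesn't give the schema, and the last row uses 2010-03-02 because the question's 2010-02-30 isn't a valid date (the LEFT JOIN output later in this answer shows 2010-03-02 for that row).

```sql
-- Hypothetical schema: the column types are assumed, not taken from the question.
create table tbl
(
    membercode int     not null,
    claimid    char(6) not null,
    startdate  date    not null,
    enddate    date    not null
);

-- Sample rows from the question (multi-row VALUES needs SQL Server 2008 or later).
insert into tbl (membercode, claimid, startdate, enddate) values
 (1, '012345', '20100115', '20100120')
,(1, '012350', '20100119', '20100122')
,(1, '012352', '20100120', '20100125')
,(1, '012355', '20100126', '20100130')
,(2, '012357', '20100120', '20100125')
,(2, '012359', '20100130', '20100205')
,(2, '012360', '20100204', '20100215')
-- Assumption: the question lists 2010-02-30 here, which is not a real date.
,(3, '012365', '20100215', '20100302');
```

The unseparated 'yyyymmdd' literals are used because SQL Server reads them the same way regardless of the session's language or DATEFORMAT setting.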
Then check whether a member's number of gapless claim pairs equals its total number of claim pairs:
with s as
(
select *, row_number() over(partition by membercode order by startdate) rn
from tbl
)
,gaps as
(
select a.membercode, a.startdate, a.enddate, b.startdate as nextstartdate
,datediff(d, a.enddate, b.startdate) as gap
from s a
join s b on b.membercode = a.membercode and b.rn = a.rn + 1
)
select membercode, count(*) as count, sum(case when gap <= 1 then 1 end) as gapless_count
from gaps
group by membercode;
Output:
| MEMBERCODE | COUNT | GAPLESS_COUNT |
--------------------------------------
| 1 | 3 | 3 |
| 2 | 2 | 1 |
Finally, filter them, keeping only the members that have no gaps in their claims:
with s as
(
select *, row_number() over(partition by membercode order by startdate) rn
from tbl
)
,gaps as
(
select a.membercode, a.startdate, a.enddate, b.startdate as nextstartdate
,datediff(d, a.enddate, b.startdate) as gap
from s a
join s b on b.membercode = a.membercode and b.rn = a.rn + 1
)
select membercode
from gaps
group by membercode
having sum(case when gap <= 1 then 1 end) = count(*);
Output:
| MEMBERCODE |
--------------
| 1 |
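Note that you don't need COUNT(*) > 1 to detect members with two or more claims. Because we use JOIN rather than LEFT JOIN, members who have yet to have a second claim are discarded automatically. If you opt to use LEFT JOIN instead, here is the (longer) version, with the same output as above: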
with s as
(
select *, row_number() over(partition by membercode order by startdate) rn
from tbl
)
,gaps as
(
select a.membercode, a.startdate, a.enddate, b.startdate as nextstartdate
,datediff(d, a.enddate, b.startdate) as gap
from s a
left join s b on b.membercode = a.membercode and b.rn = a.rn + 1
)
select membercode
from gaps
group by membercode
having sum(case when gap <= 1 then 1 end) = count(gap)
and count(*) > 1; -- members who have two or more claims only
Here's how to look at the data of the query above before filtering:
with s as
(
select *, row_number() over(partition by membercode order by startdate) rn
from tbl
)
,gaps as
(
select a.membercode, a.startdate, a.enddate, b.startdate as nextstartdate
,datediff(d, a.enddate, b.startdate) as gap
from s a
left join s b on b.membercode = a.membercode and b.rn = a.rn + 1
)
select * from gaps;
Output:
| MEMBERCODE | STARTDATE | ENDDATE | NEXTSTARTDATE | GAP |
-----------------------------------------------------------------
| 1 | 2010-01-15 | 2010-01-20 | 2010-01-19 | -1 |
| 1 | 2010-01-19 | 2010-01-22 | 2010-01-20 | -2 |
| 1 | 2010-01-20 | 2010-01-25 | 2010-01-26 | 1 |
| 1 | 2010-01-26 | 2010-01-30 | (null) | (null) |
| 2 | 2010-01-20 | 2010-01-25 | 2010-01-30 | 5 |
| 2 | 2010-01-30 | 2010-02-05 | 2010-02-04 | -1 |
| 2 | 2010-02-04 | 2010-02-15 | (null) | (null) |
| 3 | 2010-02-15 | 2010-03-02 | (null) | (null) |
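One caveat with comparing each claim only to the next one by start date: if an earlier claim is long enough to span several later ones, a short claim in the middle can make the query report a gap that is actually covered. For example, claims running 01-01 to 01-31, 01-05 to 01-06 and 01-20 to 01-25 give a 14-day "gap" between the second and third rows, even though the first claim covers it. A possible variant, sketched here against the same tbl and not tested on the asker's data, compares the next start date with the latest end date seen so far. SQL Server 2008 has no ordered window MAX, so the sketch uses CROSS APPLY for the running maximum.

```sql
with s as
(
    select *, row_number() over(partition by membercode order by startdate, enddate) rn
    from tbl
)
,gaps as
(
    select a.membercode, b.startdate as nextstartdate
        ,datediff(d, m.maxenddate, b.startdate) as gap
    from s a
    join s b on b.membercode = a.membercode and b.rn = a.rn + 1
    cross apply
    (
        -- latest end date among this member's claims up to and including row a
        select max(p.enddate) as maxenddate
        from s p
        where p.membercode = a.membercode and p.rn <= a.rn
    ) m
)
select membercode
from gaps
group by membercode
having sum(case when gap <= 1 then 1 end) = count(*);
```

On the sample data this returns the same single member as before. The correlated lookup rescans each member's earlier claims, so on a million rows it will likely want an index on (membercode, startdate, enddate), or materializing s into a temp table first.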
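EDIT after the requirement clarification:
Per your clarification, you want to include members who have yet to have a second claim as well. Do this instead: http://sqlfiddle.com/#!3/c3365/22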
with s as
(
select *, row_number() over(partition by membercode order by startdate) rn
from tbl
)
,gaps as
(
select a.membercode, a.startdate, a.enddate, b.startdate as nextstartdate
,datediff(d, a.enddate, b.startdate) as gap
from s a
left join s b on b.membercode = a.membercode and b.rn = a.rn + 1
)
select membercode
from gaps
group by membercode
having sum(case when gap <= 1 then 1 end) = count(gap)
-- members who have yet to have a second claim are valid too
or count(nextstartdate) = 0;
Output:
| MEMBERCODE |
--------------
| 1 |
| 3 |
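The technique is to count the member's nextstartdate values. If a member has no next start date (i.e. count(nextstartdate) = 0), they have a single claim only and are valid too, so just append this OR condition: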
or count(nextstartdate) = 0;
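Actually, the condition below would suffice too. I wanted the query to be more self-documenting, though, which is why I recommend counting the member's nextstartdate. Here is an alternative condition for detecting members who have yet to have a second claim: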
or count(*) = 1;
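The extra OR condition is needed because of how the aggregates treat NULL. A single-claim member has exactly one row in gaps, and its gap is NULL. The one-row query below is only an illustration (the derived table t is made up for the demo) of what the three aggregates return in that situation.

```sql
-- One row with a NULL gap, mimicking a member that has a single claim.
select sum(case when gap <= 1 then 1 end) as gapless_count -- NULL: no row satisfies the CASE
      ,count(gap)                         as gap_count     -- 0: count(column) skips NULLs
      ,count(*)                           as row_count     -- 1: count(*) counts every row
from (select cast(null as int) as gap) t;
```

Since NULL = 0 evaluates to unknown rather than true, the first HAVING condition alone would drop such a member, which is why it is paired with count(nextstartdate) = 0 or count(*) = 1.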
By the way, we also have to change this comparison:
sum(case when gap <= 1 then 1 end) = count(*)
to this (as we are now using LEFT JOIN):
sum(case when gap <= 1 then 1 end) = count(gap)
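For reference, the same query could be pointed at the table from the question's UPDATE roughly as follows. This is a sketch under assumptions: the table and column names are copied from the asker's attempt, MBR_KEY is assumed to be the column that identifies a member, and FILLED_DT and PresCoverEndDT are assumed to be date or datetime columns.

```sql
-- Column mapping from the question's UPDATE:
--   FILLED_DT = start date, PresCoverEndDT = end date
-- Assumption: MBR_KEY identifies the member.
with s as
(
    select MBR_KEY, FILLED_DT, PresCoverEndDT
        ,row_number() over(partition by MBR_KEY order by FILLED_DT) rn
    from Temp_Claims_PRIOR_STEP_5
)
,gaps as
(
    select a.MBR_KEY, b.FILLED_DT as nextstartdate
        ,datediff(d, a.PresCoverEndDT, b.FILLED_DT) as gap
    from s a
    left join s b on b.MBR_KEY = a.MBR_KEY and b.rn = a.rn + 1
)
select MBR_KEY as Valid_Member_Code
from gaps
group by MBR_KEY
having sum(case when gap <= 1 then 1 end) = count(gap)
    -- members who have yet to have a second claim are valid too
    or count(nextstartdate) = 0;
```

If the two columns are datetime values with a time portion, datediff(d, ...) still counts calendar-day boundaries, so a claim ending late on one day and the next starting early on the following day gives a gap of 1.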