Tia*_*awy 7 sql t-sql sql-server-2016
我有一个如下所示的表格.我想要的是彼此连续的行被组合在一起 - 对于每个"ID".列IsContinued标记下一行是否应与当前行组合
我的数据如下:
+-----+--------+-------------+-----------+----------+
| ID | Period | IsContinued | StartDate | EndDate |
+-----+--------+-------------+-----------+----------+
| 123 | 1 | 1 | 20180101 | 20180404 |
+-----+--------+-------------+-----------+----------+
| 123 | 2 | 1 | 20180501 | 20180910 |
+-----+--------+-------------+-----------+----------+
| 123 | 3 | 0 | 20181001 | 20181201 |
+-----+--------+-------------+-----------+----------+
| 123 | 4 | 1 | 20190105 | 20190228 |
+-----+--------+-------------+-----------+----------+
| 123 | 5 | 0 | 20190401 | 20190430 |
+-----+--------+-------------+-----------+----------+
| 456 | 2 | 1 | 20180201 | 20180215 |
+-----+--------+-------------+-----------+----------+
| 456 | 3 | 0 | 20180301 | 20180401 |
+-----+--------+-------------+-----------+----------+
| 456 | 4 | 0 | 20180501 | 20180530 |
+-----+--------+-------------+-----------+----------+
| 456 | 5 | 0 | 20180701 | 20180705 |
+-----+--------+-------------+-----------+----------+
Run Code Online (Sandbox Code Playgroud)
我想要的最终结果是:
+-----+-------------+-----------+-----------+----------+
| ID | PeriodStart | PeriodEnd | StartDate | EndDate |
+-----+-------------+-----------+-----------+----------+
| 123 | 1 | 3 | 20180101 | 20181201 |
+-----+-------------+-----------+-----------+----------+
| 123 | 4 | 5 | 20190105 | 20190430 |
+-----+-------------+-----------+-----------+----------+
| 456 | 2 | 3 | 20180201 | 20180401 |
+-----+-------------+-----------+-----------+----------+
| 456 | 4 | 4 | 20180501 | 20180530 |
+-----+-------------+-----------+-----------+----------+
| 456 | 5 | 5 | 20180701 | 20180705 |
+-----+-------------+-----------+-----------+----------+
Run Code Online (Sandbox Code Playgroud)
DDL声明:
CREATE TABLE #Period (ID INT, PeriodNr INT, IsContinued INT, STARTDATE DATE, ENDDATE DATE)
INSERT INTO #Period VALUES (123,1,1,'20180101', '20180404'),
(123,2,1,'20180501', '20180910'),
(123,3,0,'20181001', '20181201'),
(123,4,1,'20190105', '20190228'),
(123,5,0,'20190401', '20190430'),
(456,2,1,'20180201', '20180215'),
(456,3,0,'20180301', '20180401'),
(456,4,0,'20180501', '20180530'),
(456,5,0,'20180701', '20180705')
Run Code Online (Sandbox Code Playgroud)
代码应该在SQL Server 2016上运行
谢谢!
这是一种方法:
with removeFluff as
(
SELECT *
FROM (
SELECT ID, PeriodNr, IsContinued, STARTDATE, ENDDATE, LAG(IsContinued,1,2) OVER (PARTITION BY ID ORDER BY PERIODNR) Lag
FROM #Period
) A
WHERE (IsContinued <> Lag) OR (IsContinued + Lag = 0)
)
,getValues as
(
SELECT ID,
CASE WHEN LAG(IsContinued) OVER (PARTITION BY ID ORDER BY PeriodNr) = 1 THEN LAG(PeriodNr) OVER (PARTITION BY ID ORDER BY PeriodNr) ELSE PeriodNr END PeriodStart,
PeriodNr PeriodEnd,
CASE WHEN LAG(IsContinued) OVER (PARTITION BY ID ORDER BY PeriodNr) = 1 THEN LAG(STARTDATE) OVER (PARTITION BY ID ORDER BY PeriodNr) ELSE STARTDATE END StartDate,
EndDate,
IsContinued
FROM removeFluff r
)
SELECT ID, PeriodStart, PeriodEnd, StartDate, EndDate
FROM getValues
WHERE IsContinued = 0
Run Code Online (Sandbox Code Playgroud)
输出:
ID PeriodStart PeriodEnd StartDate EndDate
123 1 3 2018-01-01 2018-12-01
123 4 5 2019-01-05 2019-04-30
456 2 3 2018-02-01 2018-04-01
456 4 4 2018-05-01 2018-05-30
456 5 5 2018-07-01 2018-07-05
Run Code Online (Sandbox Code Playgroud)
方法:
removeFluffcte 删除不重要的行。这些是不开始或结束段的记录(示例数据中的第 2 行)LAG(IsContinued) ... = 0),即。上一行已完成LAG(IsContinued) ... = 1)CASE的表达式中getValuesIsContinued = 0。这是因为我们曾经LAG在“结束”数据行上获取“开始”数据,所以我们只想选择结束行