Geo*_*ton 5 sql sql-server date-range
想象一下下Loans表:
BorrowerID StartDate DueDate
=============================================
1 2012-09-02 2012-10-01
2 2012-10-05 2012-10-21
3 2012-11-07 2012-11-09
4 2012-12-01 2013-01-01
4 2012-12-01 2013-01-14
1 2012-12-20 2013-01-06
3 2013-01-07 2013-01-22
3 2013-01-15 2013-01-18
1 2013-02-20 2013-02-24
Run Code Online (Sandbox Code Playgroud)
我如何选择BorrowerID那些一次只获得一笔贷款的人?这包括那些只购买过一笔贷款的借款人,以及那些已经拿出一笔以上贷款的借款人,前提是如果你要划出贷款的时间表,那么这些贷款都不会重叠.例如,在上表中,它应该只找到借款人1和2.
我已经尝试过尝试将表加入到自身中,但实际上还没有成功.任何指针都非常感谢!
要解决此问题,您需要一个两步方法,如以下SQL小提琴中所述.我确实在示例数据中添加了一个LoanId列,查询要求存在这样一个唯一的id.如果没有,则需要调整join子句以确保贷款不会与自身匹配.
MS SQL Server 2008架构设置:
CREATE TABLE dbo.Loans
(LoanID INT, [BorrowerID] int, [StartDate] datetime, [DueDate] datetime)
GO
INSERT INTO dbo.Loans
(LoanID, [BorrowerID], [StartDate], [DueDate])
VALUES
(1, 1, '2012-09-02 00:00:00', '2012-10-01 00:00:00'),
(2, 2, '2012-10-05 00:00:00', '2012-10-21 00:00:00'),
(3, 3, '2012-11-07 00:00:00', '2012-11-09 00:00:00'),
(4, 4, '2012-12-01 00:00:00', '2013-01-01 00:00:00'),
(5, 4, '2012-12-01 00:00:00', '2013-01-14 00:00:00'),
(6, 1, '2012-12-20 00:00:00', '2013-01-06 00:00:00'),
(7, 3, '2013-01-07 00:00:00', '2013-01-22 00:00:00'),
(8, 3, '2013-01-15 00:00:00', '2013-01-18 00:00:00'),
(9, 1, '2013-02-20 00:00:00', '2013-02-24 00:00:00')
GO
Run Code Online (Sandbox Code Playgroud)
首先,您需要找出哪些贷款与另一笔贷款重叠.该查询用于<=比较开始日期和截止日期.这计算贷款,其中第二个开始于同一天,第一个开始重叠.如果你需要那些不重叠使用,<而不是在两个地方.
查询1:
SELECT
*,
CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2
WHERE L2.BorrowerID = L1.BorrowerID
AND L2.LoanID <> L1.LoanID
AND L1.StartDate <= L2.DueDate
AND L2.StartDate <= l1.DueDate)
THEN 1
ELSE 0
END AS HasOverlappingLoan
FROM dbo.Loans L1;
Run Code Online (Sandbox Code Playgroud)
结果:
| LOANID | BORROWERID | STARTDATE | DUEDATE | HASOVERLAPPINGLOAN |
|--------|------------|----------------------------------|---------------------------------|--------------------|
| 1 | 1 | September, 02 2012 00:00:00+0000 | October, 01 2012 00:00:00+0000 | 0 |
| 2 | 2 | October, 05 2012 00:00:00+0000 | October, 21 2012 00:00:00+0000 | 0 |
| 3 | 3 | November, 07 2012 00:00:00+0000 | November, 09 2012 00:00:00+0000 | 0 |
| 4 | 4 | December, 01 2012 00:00:00+0000 | January, 01 2013 00:00:00+0000 | 1 |
| 5 | 4 | December, 01 2012 00:00:00+0000 | January, 14 2013 00:00:00+0000 | 1 |
| 6 | 1 | December, 20 2012 00:00:00+0000 | January, 06 2013 00:00:00+0000 | 0 |
| 7 | 3 | January, 07 2013 00:00:00+0000 | January, 22 2013 00:00:00+0000 | 1 |
| 8 | 3 | January, 15 2013 00:00:00+0000 | January, 18 2013 00:00:00+0000 | 1 |
| 9 | 1 | February, 20 2013 00:00:00+0000 | February, 24 2013 00:00:00+0000 | 0 |
Run Code Online (Sandbox Code Playgroud)
现在,通过该信息,您可以确定此查询没有重叠贷款的借款人:
查询2:
WITH OverlappingLoans AS (
SELECT
*,
CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2
WHERE L2.BorrowerID = L1.BorrowerID
AND L2.LoanID <> L1.LoanID
AND L1.StartDate <= L2.DueDate
AND L2.StartDate <= l1.DueDate)
THEN 1
ELSE 0
END AS HasOverlappingLoan
FROM dbo.Loans L1
),
OverlappingBorrower AS (
SELECT BorrowerID, MAX(HasOverlappingLoan) HasOverlappingLoan
FROM OverlappingLoans
GROUP BY BorrowerID
)
SELECT *
FROM OverlappingBorrower
WHERE hasOverlappingLoan = 0;
Run Code Online (Sandbox Code Playgroud)
或者您甚至可以通过计算贷款以及计算数据库中每个借款人的其他贷款重叠的贷款数量来获取更多信息.(注意,如果贷款A和贷款B重叠,这两个都将被此查询计算为重叠贷款)
结果:
| BORROWERID | HASOVERLAPPINGLOAN |
|------------|--------------------|
| 1 | 0 |
| 2 | 0 |
Run Code Online (Sandbox Code Playgroud)
问题3:
WITH OverlappingLoans AS (
SELECT
*,
CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2
WHERE L2.BorrowerID = L1.BorrowerID
AND L2.LoanID <> L1.LoanID
AND L1.StartDate <= L2.DueDate
AND L2.StartDate <= l1.DueDate)
THEN 1
ELSE 0
END AS HasOverlappingLoan
FROM dbo.Loans L1
)
SELECT BorrowerID,COUNT(1) LoanCount, SUM(hasOverlappingLoan) OverlappingCount
FROM OverlappingLoans
GROUP BY BorrowerID;
Run Code Online (Sandbox Code Playgroud)
结果:
| BORROWERID | LOANCOUNT | OVERLAPPINGCOUNT |
|------------|-----------|------------------|
| 1 | 3 | 0 |
| 2 | 1 | 0 |
| 3 | 3 | 2 |
| 4 | 2 | 2 |
Run Code Online (Sandbox Code Playgroud)
更新:由于要求实际上要求的解决方案不依赖于每笔贷款的唯一标识符,我做了以下更改:
1)我添加了一个借款人,其中有两笔贷款具有相同的开始和到期日
MS SQL Server 2008架构设置:
CREATE TABLE dbo.Loans
([BorrowerID] int, [StartDate] datetime, [DueDate] datetime)
GO
INSERT INTO dbo.Loans
([BorrowerID], [StartDate], [DueDate])
VALUES
( 1, '2012-09-02 00:00:00', '2012-10-01 00:00:00'),
( 2, '2012-10-05 00:00:00', '2012-10-21 00:00:00'),
( 3, '2012-11-07 00:00:00', '2012-11-09 00:00:00'),
( 4, '2012-12-01 00:00:00', '2013-01-01 00:00:00'),
( 4, '2012-12-01 00:00:00', '2013-01-14 00:00:00'),
( 1, '2012-12-20 00:00:00', '2013-01-06 00:00:00'),
( 3, '2013-01-07 00:00:00', '2013-01-22 00:00:00'),
( 3, '2013-01-15 00:00:00', '2013-01-18 00:00:00'),
( 1, '2013-02-20 00:00:00', '2013-02-24 00:00:00'),
( 5, '2013-02-20 00:00:00', '2013-02-24 00:00:00'),
( 5, '2013-02-20 00:00:00', '2013-02-24 00:00:00')
GO
Run Code Online (Sandbox Code Playgroud)
2)那些"同等日期"贷款需要额外的步骤:
查询1:
SELECT BorrowerID, StartDate, DueDate, COUNT(1) LoanCount
FROM dbo.Loans
GROUP BY BorrowerID, StartDate, DueDate;
Run Code Online (Sandbox Code Playgroud)
结果:
| BORROWERID | STARTDATE | DUEDATE | LOANCOUNT |
|------------|----------------------------------|---------------------------------|-----------|
| 1 | September, 02 2012 00:00:00+0000 | October, 01 2012 00:00:00+0000 | 1 |
| 1 | December, 20 2012 00:00:00+0000 | January, 06 2013 00:00:00+0000 | 1 |
| 1 | February, 20 2013 00:00:00+0000 | February, 24 2013 00:00:00+0000 | 1 |
| 2 | October, 05 2012 00:00:00+0000 | October, 21 2012 00:00:00+0000 | 1 |
| 3 | November, 07 2012 00:00:00+0000 | November, 09 2012 00:00:00+0000 | 1 |
| 3 | January, 07 2013 00:00:00+0000 | January, 22 2013 00:00:00+0000 | 1 |
| 3 | January, 15 2013 00:00:00+0000 | January, 18 2013 00:00:00+0000 | 1 |
| 4 | December, 01 2012 00:00:00+0000 | January, 01 2013 00:00:00+0000 | 1 |
| 4 | December, 01 2012 00:00:00+0000 | January, 14 2013 00:00:00+0000 | 1 |
| 5 | February, 20 2013 00:00:00+0000 | February, 24 2013 00:00:00+0000 | 2 |
Run Code Online (Sandbox Code Playgroud)
3)现在,每个贷款范围都是唯一的,我们可以再次使用旧技术.但是,我们还需要考虑那些"同等日期"的贷款.(L1.StartDate <> L2.StartDate OR L1.DueDate <> L2.DueDate)防止贷款与自身匹配.OR LoanCount > 1占"同等日期"贷款.
查询2:
WITH NormalizedLoans AS (
SELECT BorrowerID, StartDate, DueDate, COUNT(1) LoanCount
FROM dbo.Loans
GROUP BY BorrowerID, StartDate, DueDate
)
SELECT
*,
CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2
WHERE L2.BorrowerID = L1.BorrowerID
AND L1.StartDate <= L2.DueDate
AND L2.StartDate <= l1.DueDate
AND (L1.StartDate <> L2.StartDate
OR L1.DueDate <> L2.DueDate)
)
OR LoanCount > 1
THEN 1
ELSE 0
END AS HasOverlappingLoan
FROM NormalizedLoans L1;
Run Code Online (Sandbox Code Playgroud)
结果:
| BORROWERID | STARTDATE | DUEDATE | LOANCOUNT | HASOVERLAPPINGLOAN |
|------------|----------------------------------|---------------------------------|-----------|--------------------|
| 1 | September, 02 2012 00:00:00+0000 | October, 01 2012 00:00:00+0000 | 1 | 0 |
| 1 | December, 20 2012 00:00:00+0000 | January, 06 2013 00:00:00+0000 | 1 | 0 |
| 1 | February, 20 2013 00:00:00+0000 | February, 24 2013 00:00:00+0000 | 1 | 0 |
| 2 | October, 05 2012 00:00:00+0000 | October, 21 2012 00:00:00+0000 | 1 | 0 |
| 3 | November, 07 2012 00:00:00+0000 | November, 09 2012 00:00:00+0000 | 1 | 0 |
| 3 | January, 07 2013 00:00:00+0000 | January, 22 2013 00:00:00+0000 | 1 | 1 |
| 3 | January, 15 2013 00:00:00+0000 | January, 18 2013 00:00:00+0000 | 1 | 1 |
| 4 | December, 01 2012 00:00:00+0000 | January, 01 2013 00:00:00+0000 | 1 | 1 |
| 4 | December, 01 2012 00:00:00+0000 | January, 14 2013 00:00:00+0000 | 1 | 1 |
| 5 | February, 20 2013 00:00:00+0000 | February, 24 2013 00:00:00+0000 | 2 | 1 |
Run Code Online (Sandbox Code Playgroud)
此查询逻辑没有更改(除了切换开头).
问题3:
WITH NormalizedLoans AS (
SELECT BorrowerID, StartDate, DueDate, COUNT(1) LoanCount
FROM dbo.Loans
GROUP BY BorrowerID, StartDate, DueDate
),
OverlappingLoans AS (
SELECT
*,
CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2
WHERE L2.BorrowerID = L1.BorrowerID
AND L1.StartDate <= L2.DueDate
AND L2.StartDate <= l1.DueDate
AND (L1.StartDate <> L2.StartDate
OR L1.DueDate <> L2.DueDate)
)
OR LoanCount > 1
THEN 1
ELSE 0
END AS HasOverlappingLoan
FROM NormalizedLoans L1
),
OverlappingBorrower AS (
SELECT BorrowerID, MAX(HasOverlappingLoan) HasOverlappingLoan
FROM OverlappingLoans
GROUP BY BorrowerID
)
SELECT *
FROM OverlappingBorrower
WHERE hasOverlappingLoan = 0;
Run Code Online (Sandbox Code Playgroud)
结果:
| BORROWERID | HASOVERLAPPINGLOAN |
|------------|--------------------|
| 1 | 0 |
| 2 | 0 |
Run Code Online (Sandbox Code Playgroud)
4)在这个计算查询中,我们需要再次纳入"相等日期"贷款计数.为此,我们使用SUM(LoanCount)而不是简单COUNT.我们还必须乘以hasOverlappingLoanLoanCount以再次获得正确的重叠计数.
查询4:
WITH NormalizedLoans AS (
SELECT BorrowerID, StartDate, DueDate, COUNT(1) LoanCount
FROM dbo.Loans
GROUP BY BorrowerID, StartDate, DueDate
),
OverlappingLoans AS (
SELECT
*,
CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2
WHERE L2.BorrowerID = L1.BorrowerID
AND L1.StartDate <= L2.DueDate
AND L2.StartDate <= l1.DueDate
AND (L1.StartDate <> L2.StartDate
OR L1.DueDate <> L2.DueDate)
)
OR LoanCount > 1
THEN 1
ELSE 0
END AS HasOverlappingLoan
FROM NormalizedLoans L1
)
SELECT BorrowerID,SUM(LoanCount) LoanCount, SUM(hasOverlappingLoan*LoanCount) OverlappingCount
FROM OverlappingLoans
GROUP BY BorrowerID;
Run Code Online (Sandbox Code Playgroud)
结果:
| BORROWERID | LOANCOUNT | OVERLAPPINGCOUNT |
|------------|-----------|------------------|
| 1 | 3 | 0 |
| 2 | 1 | 0 |
| 3 | 3 | 2 |
| 4 | 2 | 2 |
| 5 | 2 | 2 |
Run Code Online (Sandbox Code Playgroud)
我强烈建议找到一种方法来使用我的第一个解决方案,因为没有主键的贷款表是一个,让我们说"奇怪"的设计.但是,如果你真的无法到达那里,请使用第二种解决方案.