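I'm trying to combine multiple date ranges (my load is at most about 500, and around 10 in most cases) that may or may not overlap into the largest possible contiguous date ranges. For example:
Data: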
CREATE TABLE test (
id SERIAL PRIMARY KEY NOT NULL,
range DATERANGE
);
INSERT INTO test (range) VALUES
(DATERANGE('2015-01-01', '2015-01-05')),
(DATERANGE('2015-01-01', '2015-01-03')),
(DATERANGE('2015-01-03', '2015-01-06')),
(DATERANGE('2015-01-07', '2015-01-09')),
(DATERANGE('2015-01-08', '2015-01-09')),
(DATERANGE('2015-01-12', NULL)),
(DATERANGE('2015-01-10', '2015-01-12')),
(DATERANGE('2015-01-10', '2015-01-12'));
The table looks like:
id | range
----+-------------------------
1 | [2015-01-01,2015-01-05)
2 | [2015-01-01,2015-01-03)
3 | [2015-01-03,2015-01-06)
4 | [2015-01-07,2015-01-09)
5 | [2015-01-08,2015-01-09)
6 | [2015-01-12,)
7 | [2015-01-10,2015-01-12)
8 | [2015-01-10,2015-01-12)
(8 rows)
Desired results:
combined
--------------------------
[2015-01-01, 2015-01-06)
[2015-01-07, 2015-01-09)
[2015-01-10, ) …
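A minimal sketch of one way to express this merge, not taken from the original question. It assumes PostgreSQL 14 or later, where the built-in range_agg aggregate and multirange types are available, and it reuses the test table defined above:

```sql
-- Assumption: PostgreSQL 14+ (multirange support).
-- range_agg() unions all input ranges into a single multirange,
-- merging overlapping and adjacent ranges along the way;
-- unnest() then expands that multirange into one row per merged range.
SELECT unnest(range_agg(range)) AS combined
FROM   test;
```

On older PostgreSQL versions these functions do not exist, so a window-function approach over the sorted lower and upper bounds would be needed instead.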
This question is similar to "Optimizing IP Range Search?", but that one is restricted to SQL Server 2000.
Suppose I have 10 million ranges provisionally stored in a table structured and populated as below.
CREATE TABLE MyTable
(
Id INT IDENTITY PRIMARY KEY,
RangeFrom INT NOT NULL,
RangeTo INT NOT NULL,
CHECK (RangeTo > RangeFrom),
INDEX IX1 (RangeFrom,RangeTo),
INDEX IX2 (RangeTo,RangeFrom)
);
WITH RandomNumbers
AS (SELECT TOP 10000000 ABS(CRYPT_GEN_RANDOM(4)%100000000) AS Num
FROM sys.all_objects o1,
sys.all_objects o2,
sys.all_objects o3,
sys.all_objects o4)
INSERT INTO MyTable
(RangeFrom,
RangeTo)
SELECT Num,
Num + 1 + CRYPT_GEN_RANDOM(1)
FROM RandomNumbers
I need to find all ranges that contain the value 50,000,000. I tried the following query:
SELECT *
FROM MyTable
WHERE 50000000 …
This (anonymized) query takes about 2 minutes to execute.
SELECT
ly.Col1
,sr.Col2
,sr.Col3
,sr.Col4
INTO TempDb..TempLYT
FROM Tempdb..T1 ly
JOIN TempDb..T2 sr on sr.[DateTimeCol] BETWEEN ly.DateTimeStart and ly.DateTimeEnd
WHERE sr.Col5 = 1 OR sr.Col5 = 2
Are there any alternatives that could help with this query?
After applying the index suggested by Paul White, the query plan looks like this: