Eri*_*rik 7 sql oracle performance self-join query-optimization
I have a table in an Oracle database. The schema is:
create table PERIODS
(
    ID        NUMBER,
    STARTTIME TIMESTAMP,
    ENDTIME   TIMESTAMP,
    TYPE      VARCHAR2(100)
)
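I have two different TYPEs: TYPEA and TYPEB. They have independent start and end times, and they can overlap. What I want to find are the TYPEB periods that start within, are completely contained in, or end within a given TYPEA period.
This is what I have come up with so far (with some sample data):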
WITH mydata
     AS (SELECT 100 ID,
                To_timestamp('2015-08-01 11:00', 'YYYY-MM-DD HH24:MI') STARTTIME,
                To_timestamp('2015-08-01 11:20', 'YYYY-MM-DD HH24:MI') ENDTIME,
                'TYPEA' TYPE
         FROM   dual
         UNION ALL
         SELECT 110 ID,
                To_timestamp('2015-08-01 11:30', 'YYYY-MM-DD HH24:MI') STARTTIME,
                To_timestamp('2015-08-01 11:50', 'YYYY-MM-DD HH24:MI') ENDTIME,
                'TYPEA' TYPE
         FROM   dual
         UNION ALL
         SELECT 120 ID,
                To_timestamp('2015-08-01 12:00', 'YYYY-MM-DD HH24:MI') STARTTIME,
                To_timestamp('2015-08-01 12:20', 'YYYY-MM-DD HH24:MI') ENDTIME,
                'TYPEA' TYPE
         FROM   dual
         UNION ALL
         SELECT 105 ID,
                To_timestamp('2015-08-01 10:55', 'YYYY-MM-DD HH24:MI') STARTTIME,
                To_timestamp('2015-08-01 11:05', 'YYYY-MM-DD HH24:MI') ENDTIME,
                'TYPEB' TYPE
         FROM   dual
         UNION ALL
         SELECT 108 ID,
                To_timestamp('2015-08-01 11:05', 'YYYY-MM-DD HH24:MI') STARTTIME,
                To_timestamp('2015-08-01 11:15', 'YYYY-MM-DD HH24:MI') ENDTIME,
                'TYPEB' TYPE
         FROM   dual
         UNION ALL
         SELECT 111 ID,
                To_timestamp('2015-08-01 11:15', 'YYYY-MM-DD HH24:MI') STARTTIME,
                To_timestamp('2015-08-01 12:25', 'YYYY-MM-DD HH24:MI') ENDTIME,
                'TYPEB' TYPE
         FROM   dual),
     typeas
     AS (SELECT starttime,
                endtime
         FROM   mydata
         WHERE  TYPE = 'TYPEA'),
     typebs
     AS (SELECT id,
                starttime,
                endtime
         FROM   mydata
         WHERE  TYPE = 'TYPEB')
SELECT id
FROM   typebs b
       join typeas a
         ON ( b.starttime BETWEEN a.starttime AND a.endtime )
             OR ( b.starttime BETWEEN a.starttime AND a.endtime
                  AND b.endtime BETWEEN a.starttime AND a.endtime )
             OR ( b.endtime BETWEEN a.starttime AND a.endtime )
ORDER  BY id;
This seems to work in principle; the result of the query above is:
ID
----------
105
108
111
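So it selects the three TYPEB periods that start or end within the first TYPEA period.
The problem is that the table has about 200k entries, and already at this size the query above is very slow. This surprises me, because the numbers of both TYPEA and TYPEB entries are low (1-2k each).
Is there a more efficient way to perform this kind of self-join? Am I missing something else in my query?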
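One way to see where the time goes is to look at the execution plan. The sketch below assumes the query is run against the real PERIODS table rather than the sample-data CTE, and it drops the middle OR branch of the join condition, since that branch is already covered by the first one. `EXPLAIN PLAN` and `DBMS_XPLAN.DISPLAY` are standard Oracle facilities; the plan actually shown depends on the table's statistics and indexes, which are not described here.

```sql
-- Sketch: store the optimizer's plan for the self-join on the real table.
-- Assumes PERIODS is the table from the schema above.
EXPLAIN PLAN FOR
SELECT b.id
FROM   periods b
       JOIN periods a
         ON a.type = 'TYPEA'
            AND ( b.starttime BETWEEN a.starttime AND a.endtime
                   OR b.endtime BETWEEN a.starttime AND a.endtime )
WHERE  b.type = 'TYPEB'
ORDER  BY b.id;

-- Show the plan that was just written to PLAN_TABLE.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Full table scans of PERIODS on both sides of the join, or a nested loop driven by the OR condition, would be the first things to look for in the output.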
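Maybe this is worth a try (also, in Oracle you need to write the most restrictive condition last; don't ask me why, and don't take my word for it either, better to run a performance test yourself):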
SELECT
    p.id
FROM
    periods p
WHERE
    EXISTS(SELECT * FROM periods q WHERE
        (p.startTime BETWEEN q.startTime AND q.endTime
         OR p.endTime BETWEEN q.startTime AND q.endTime
         OR (p.startTime < q.startTime AND p.endTime > q.endTime) -- overlapping correction, remove if not needed
        ) AND q.type = 'TYPEA'
    ) AND p.type = 'TYPEB'
ORDER BY
    p.id
;
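If the overlapping correction is kept, the three OR branches reduce to the usual two-comparison overlap test. This is a minimal sketch, and it relies on two assumptions the question does not state: every row has STARTTIME <= ENDTIME, and neither column is NULL.

```sql
-- Two closed intervals overlap exactly when each one starts
-- no later than the other one ends.
-- Assumes STARTTIME <= ENDTIME and no NULLs in either column.
SELECT p.id
FROM   periods p
WHERE  p.type = 'TYPEB'
       AND EXISTS (SELECT 1
                   FROM   periods q
                   WHERE  q.type = 'TYPEA'
                          AND p.starttime <= q.endtime
                          AND p.endtime >= q.starttime)
ORDER  BY p.id;
```

Unlike the join in the question, the EXISTS form returns each TYPEB row at most once, even when it overlaps several TYPEA periods. Whether it is actually faster on the 200k-row table is something to measure rather than assume.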