Ego*_*off 8 sql oracle benchmarking oracle11g
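This query consists of 16 identical steps.
Every step performs the same calculation on the same data set (a single row),
but the last steps take far too long.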
with t0 as (select 0 as k from dual)
,t1 as (select k from t0 where k >= (select avg(k) from t0))
,t2 as (select k from t1 where k >= (select avg(k) from t1))
,t3 as (select k from t2 where k >= (select avg(k) from t2))
,t4 as (select k from t3 where k >= (select avg(k) from t3))
,t5 as (select k from t4 where k >= (select avg(k) from t4))
,t6 as (select k from t5 where k >= (select avg(k) from t5))
,t7 as (select k from t6 where k >= (select avg(k) from t6))
,t8 as (select k from t7 where k >= (select avg(k) from t7))
,t9 as (select k from t8 where k >= (select avg(k) from t8))
,t10 as (select k from t9 where k >= (select avg(k) from t9))
,t11 as (select k from t10 where k >= (select avg(k) from t10))
,t12 as (select k from t11 where k >= (select avg(k) from t11)) -- 0.5 sec
,t13 as (select k from t12 where k >= (select avg(k) from t12)) -- 1.3 sec
,t14 as (select k from t13 where k >= (select avg(k) from t13)) -- 4.5 sec
,t15 as (select k from t14 where k >= (select avg(k) from t14)) -- 30 sec
,t16 as (select k from t15 where k >= (select avg(k) from t15)) -- 4 min
select k from t16
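Subquery t10 completes immediately, but the whole query (t16) takes 4 minutes to complete.
Q1.
Why do the calculation times of identical subqueries on the same data differ so much?
Q2.
It looks like a bug, because the query runs very fast on Oracle 9 and very slowly on Oracle 11.
In fact, every select statement with a long and complex with clause behaves the same way.
Is this a known bug? (I have no access to Metalink.)
What workaround is recommended?
Q3.
I must write code for Oracle 11, and I must do all my calculations in a single select statement.
I cannot split my long statement into two separate statements to speed it up.
Is there a hint in Oracle (or perhaps some trick) that makes the whole query (t16) complete in a reasonable time (for example, within a second)? I tried to find one, but to no avail.
By the way, the execution plan is excellent, and the cost behaves as a linear (not exponential) function of the number of steps.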
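Q1: It seems this has nothing to do with calculation time. It is a bug in the optimizer algorithm, which goes wild while working out the best execution plan.
Q2: There are a number of known and fixed bugs in Oracle 11.X.0.X related to the optimization of nested queries and query factoring, but it is very hard to pin down the specific issue.
Q3: There are two undocumented hints, materialize and inline, but neither of them worked for me when I tried your example. Some change in server configuration, or an upgrade to 11.2.0.3, may raise the limit on nested with clauses: for me (on 11.2.0.3 Win7/x86) your example works fine, but increasing the number of nested tables to 30 hangs the session.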
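For reference, this is a minimal sketch of where the two hints mentioned above are written inside a with clause. It is shortened to three steps and only illustrates placement. As noted, neither hint changed the behavior in my tests, so this is not a fix.

```sql
-- Sketch only: hint placement inside each subquery factoring block.
-- MATERIALIZE asks Oracle to evaluate the block once into a temporary table.
with t0 as (select /*+ materialize */ 0 as k from dual)
,t1 as (select /*+ materialize */ k from t0 where k >= (select avg(k) from t0))
,t2 as (select /*+ materialize */ k from t1 where k >= (select avg(k) from t1))
select k from t2;

-- INLINE asks for the opposite: merge the block text into each place it is referenced.
with t0 as (select /*+ inline */ 0 as k from dual)
,t1 as (select /*+ inline */ k from t0 where k >= (select avg(k) from t0))
,t2 as (select /*+ inline */ k from t1 where k >= (select avg(k) from t1))
select k from t2;
```

Both hints are undocumented, so their effect may differ between versions and patch levels.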
A workaround might look like this:
select k from (
select k, avg(k) over (partition by null) k_avg from ( --t16
select k, avg(k) over (partition by null) k_avg from ( --t15
select k, avg(k) over (partition by null) k_avg from ( --t14
select k, avg(k) over (partition by null) k_avg from ( --t13
select k, avg(k) over (partition by null) k_avg from ( --t12
select k, avg(k) over (partition by null) k_avg from ( --t11
select k, avg(k) over (partition by null) k_avg from ( --t10
select k, avg(k) over (partition by null) k_avg from ( --t9
select k, avg(k) over (partition by null) k_avg from ( --t8
select k, avg(k) over (partition by null) k_avg from ( --t7
select k, avg(k) over (partition by null) k_avg from ( --t6
select k, avg(k) over (partition by null) k_avg from ( --t5
select k, avg(k) over (partition by null) k_avg from ( --t4
select k, avg(k) over (partition by null) k_avg from ( --t3
select k, avg(k) over (partition by null) k_avg from ( --t2
select k, avg(k) over (partition by null) k_avg from ( -- t1
select k, avg(k) over (partition by null) k_avg from (select 0 as k from dual) t0
) where k >= k_avg
) where k >= k_avg
) where k >= k_avg
) where k >= k_avg
) where k >= k_avg
) where k >= k_avg
) where k >= k_avg
) where k >= k_avg
) where k >= k_avg
) where k >= k_avg
) where k >= k_avg
) where k >= k_avg
) where k >= k_avg
) where k >= k_avg
) where k >= k_avg
) where k >= k_avg
)
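At least this works for me at 30 nesting levels, and it produces a completely different execution plan, with WINDOW BUFFER and VIEW instead of LOAD TABLE AS SELECT, SORT AGGREGATE and TABLE ACCESS FULL.
Update
I just installed 11.2.0.4 (Win7/32bit) and tested it against the initial query. Nothing changed in the optimizer's behavior.
There is no way to directly affect the CBO's behavior here, even with the inline (undocumented) or RULE (deprecated) hints. Maybe some guru knows a variant, but it is top secret to me (and to Google too :-).
Doing everything in one select statement in a reasonable time is possible if the main select statement is split into parts and placed in a function that returns a set of rows (a function returning sys_refcursor or a strongly typed cursor), but that is not an option if the query is constructed at run time.
A workaround using XML is possible, but this variant looks like removing tonsils through the rear end (sorry):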
select
extractvalue(column_value,'/t/somevalue') abc
from
table(xmlsequence((
select t2 from (
select
t0,
t1,
(
select xmlagg(
xmlelement("t",
xmlelement("k1",extractvalue(t1t.column_value,'/t/k1')),
xmlelement("somevalue", systimestamp))
)
from
table(xmlsequence(t0)) t0t,
table(xmlsequence(t1)) t1t
where
extractvalue(t1t.column_value,'/t/k1') >= (
select avg(extractvalue(t1t.column_value, '/t/k1')) from table(xmlsequence(t1))
)
and
extractvalue(t0t.column_value,'/t/k2') > 6
) t2
from (
select
t0,
(
select xmlagg(
xmlelement("t",
xmlelement("k1",extractvalue(column_value,'/t/k1')),
xmlelement("somevalue", sysdate))
)
from table(xmlsequence(t0))
where
extractvalue(column_value,'/t/k1') >= (
select avg(extractvalue(column_value, '/t/k1')) from table(xmlsequence(t0))
)
) t1
from (
select
xmlagg(xmlelement("t", xmlelement("k1", level), xmlelement("k2", level + 3))) t0
from dual connect by level < 5
)
)
)
)))
Another caveat about the odd code above: this variant is usable only when the with data sets do not contain a large number of rows.
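(This is not a complete answer. Hopefully the information here can help someone else provide a better one.)
Q1: The optimizer rewrites the query by inlining everything. The size of the internal statement doubles with each new common table expression, so the statement quickly becomes huge. For example, T15 generates a query of 3,162,172 characters.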
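To make the doubling concrete, here is a small sketch that tabulates how many copies of the t0 text end up in the inlined statement at each nesting level. It assumes every step references its predecessor exactly twice, as in the question, which gives 2^n copies at level n. That model matches the two formatted traces in this answer, which contain 4 and 8 references to DUAL. The column names are made up for illustration.

```sql
-- Sketch under the assumption above: each step references the previous one twice,
-- so the number of inlined copies of t0 doubles at every level.
select level           as nesting_level,
       power(2, level) as t0_copies
from dual
connect by level <= 16;
```

At level 15 this gives 32,768 copies, which works out to roughly 96 characters of generated text per copy for the 3,162,172 character statement.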
Code to trace the statements:
sqlplus user/pass@orcl
alter session set events '10053 trace name context forever, level 1';
with t0 as (select 0 as k from dual)
,t1 as (select k from t0 where k >= (select avg(k) from t0))
,t2 as (select k from t1 where k >= (select avg(k) from t1))
select k from t2;
exit;
sqlplus user/pass@orcl
alter session set events '10053 trace name context forever, level 1';
with t0 as (select 0 as k from dual)
,t1 as (select k from t0 where k >= (select avg(k) from t0))
,t2 as (select k from t1 where k >= (select avg(k) from t1))
,t3 as (select k from t2 where k >= (select avg(k) from t2))
select k from t3;
exit;
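If you compare the two trace files, there are a lot of differences, but most of them look minor. The real difference is only in the line after the string: Stmt: ******* UNPARSED QUERY IS *******. If you trace the larger queries, be careful when opening the trace files. Not all editors can handle such long lines, and the T20 file is 250MB!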
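If the trace files are hard to find, something like the following may help. It is a sketch that assumes an 11g session with select access to v$diag_info. The identifier cte_t2 is just a made-up tag to make the file name recognizable.

```sql
-- Tag the trace file name so it is easy to spot (cte_t2 is a hypothetical label).
alter session set tracefile_identifier = 'cte_t2';
alter session set events '10053 trace name context forever, level 1';

-- Run the statement to be traced; the 10053 trace is written when it is hard parsed.
with t0 as (select 0 as k from dual)
,t1 as (select k from t0 where k >= (select avg(k) from t0))
,t2 as (select k from t1 where k >= (select avg(k) from t1))
select k from t2;

-- Switch tracing off and ask the session where its trace file is.
alter session set events '10053 trace name context off';
select value from v$diag_info where name = 'Default Trace File';
```

Checking the file size on disk before opening it avoids loading a very large trace into an editor by accident.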
SQL from the first trace, after formatting:
SELECT "T1"."K" "K"
FROM (SELECT 0 "K"
FROM "SYS"."DUAL" "DUAL"
WHERE 0 >= (SELECT AVG(0) "AVG(K)" FROM "SYS"."DUAL" "DUAL")) "T1"
WHERE "T1"."K" >=
(SELECT AVG("T1"."K") "AVG(K)"
FROM (SELECT 0 "K"
FROM "SYS"."DUAL" "DUAL"
WHERE 0 >= (SELECT AVG(0) "AVG(K)" FROM "SYS"."DUAL" "DUAL")) "T1")
SQL from the second trace, after formatting:
SELECT "T2"."K" "K"
FROM (SELECT "T1"."K" "K"
FROM (SELECT 0 "K"
FROM "SYS"."DUAL" "DUAL"
WHERE 0 >= (SELECT AVG(0) "AVG(K)" FROM "SYS"."DUAL" "DUAL")) "T1"
WHERE "T1"."K" >=
(SELECT AVG("T1"."K") "AVG(K)"
FROM (SELECT 0 "K"
FROM "SYS"."DUAL" "DUAL"
WHERE 0 >=
(SELECT AVG(0) "AVG(K)" FROM "SYS"."DUAL" "DUAL")) "T1")) "T2"
WHERE "T2"."K" >=
(SELECT AVG("T2"."K") "AVG(K)"
FROM (SELECT "T1"."K" "K"
FROM (SELECT 0 "K"
FROM "SYS"."DUAL" "DUAL"
WHERE 0 >=
(SELECT AVG(0) "AVG(K)" FROM "SYS"."DUAL" "DUAL")) "T1"
WHERE "T1"."K" >=
(SELECT AVG("T1"."K") "AVG(K)"
FROM (SELECT 0 "K"
FROM "SYS"."DUAL" "DUAL"
WHERE 0 >= (SELECT AVG(0) "AVG(K)"
FROM "SYS"."DUAL" "DUAL")) "T1")) "T2")
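Q2: I would not say that every "complex" common table expression behaves the same way. I have seen much larger CTEs. It seems that only the extreme nesting is the problem. I cannot find any obvious bugs on Oracle Support.
ThinkJet's code looks like a good workaround. Nesting inline views is more common than nesting common table expressions.
Q3: There is possibly a hint that prevents this behavior, but I am not sure what it is. Hopefully, by showing the transformed version of the query, someone else can guess how to fix it.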