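I'm finally asking my first question (although I've been a long-time lurker).
An SQL query caught my attention a few days ago. The problem is the performance of the WHERE clause when the IN operator is used to compare an indexed column against a list of possible values.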
SELECT SUM (parts.quantity) AS quantity,
concessions.concessionCode,
concessions.description AS concessionDesc,
parts.type,
activities.activityCode,
REPLACE (activities.activityCode, activities.lvl2 || '-', '') AS activityCodeDisplay,
strings.activityDesc,
strings.activityDesc2,
strings.activityDesc3
FROM tb_parts parts,
tb_activities activities,
tb_strings strings,
tb_concessions concessions
WHERE parts.activityCode = activities.activityCode
AND parts.concessionCode = activities.concessionCode
AND activities.concessionCode = concessions.concessionCode
AND activities.concessionCode = strings.concessionCode
AND activities.activityCode = strings.activityCode
AND strings.language = 'ENG'
--AND parts.concessionCode IN ('ZD', 'G9', 'TR', 'JS0')
AND parts.concessionCode IN ('ZD', 'G9')
AND parts.date >= TO_DATE ('01/01/2013 00:00:00', 'DD/MM/YYYY HH24:MI:SS')
AND parts.date <= TO_DATE ('30/04/2013 23:59:59', 'DD/MM/YYYY HH24:MI:SS')
AND parts.type IN ('U', 'M')
AND parts.value = 'E'
GROUP BY concessions.concessionCode,
concessions.description,
parts.type,
activities.activityCode,
REPLACE (activities.activityCode, activities.lvl2|| '-', ''),
strings.activityDesc,
strings.activityDesc2,
strings.activityDesc3
ORDER BY concessions.concessionCode;
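The problem I'm having is this: if the query is run as shown (with two values in the IN list), it takes 30 seconds. If it is run with four values (as in the commented-out line), it takes 5 seconds. I would expect comparing an index against more values to take longer, but that doesn't seem to be the case. I repeated the "tests" several times during the day, and the results were always more or less the same (30 ± 1 s and 5 ± 1 s).
Any insight into why it behaves this way would be more than appreciated!
PS: I've translated the table/column names, so apologies if there are any discrepancies.
PPS: I've rewritten this code with joins and it is much faster, but the reason behind this anomaly still bugs me :)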
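Edit: Finally at work! After some tinkering I've been able to generate execution plans for both versions, and even for a third version of the query (with two IN lists in the WHERE clause, one with four values and one with two; it runs in about 600 ms). Also, there were a few questions about the data in the tables, so here is some information: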
All statistics were gathered on the day the queries were executed
Table parts
total rows - 3.2 M
matches for 2 values - 1.08 M (~34%)
matches for 4 values - 1.30 M (~41%)
Table activities
total rows - 3866
matches for 2 values - 321 (~ 8%)
matches for 4 values - 644 (~16%)
Table strings
total rows - 7436
matches for 2 values - 642 (~ 8%)
matches for 4 values - 1288 (~17%)
Index in_parts
concessionCode
username
date
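Also, I see no major difference when using dynamic sampling (other than an extra 2/3 s), assuming I did it correctly, that is, by putting /*+ dynamic_sampling(tb_parts 10) */ right after the SELECT keyword.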
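For reference, a minimal sketch of where the dynamic sampling hint goes and how a plan can be displayed with EXPLAIN PLAN and DBMS_XPLAN.DISPLAY. One point worth checking: when a table has an alias in the FROM clause, Oracle expects hints to reference the alias (`parts`) rather than the table name (`tb_parts`), and a hint that cannot be matched is ignored without an error. That might be one reason the hint appeared to make little difference. The query is cut down to the tb_parts predicates for brevity, and the column names are the post's translated placeholders.

```sql
-- Sketch only: the hint references the alias "parts", not the table name.
EXPLAIN PLAN FOR
SELECT /*+ dynamic_sampling(parts 10) */
       SUM(parts.quantity) AS quantity
FROM tb_parts parts
WHERE parts.concessionCode IN ('ZD', 'G9')
  AND parts.date >= TO_DATE('01/01/2013 00:00:00', 'DD/MM/YYYY HH24:MI:SS')
  AND parts.date <= TO_DATE('30/04/2013 23:59:59', 'DD/MM/YYYY HH24:MI:SS')
  AND parts.type IN ('U', 'M')
  AND parts.value = 'E';

-- Show the plan that was just explained, including the predicate section.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```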
For two values:
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 186 | 864 (1)| 00:00:11 |
| 1 | SORT ORDER BY | | 1 | 186 | 864 (1)| 00:00:11 |
| 2 | HASH GROUP BY | | 1 | 186 | 864 (1)| 00:00:11 |
|* 3 | TABLE ACCESS BY INDEX ROWID | tb_parts | 1 | 37 | 818 (1)| 00:00:10 |
| 4 | NESTED LOOPS | | 1 | 186 | 862 (1)| 00:00:11 |
| 5 | NESTED LOOPS | | 1 | 149 | 44 (0)| 00:00:01 |
| 6 | NESTED LOOPS | | 34 | 2108 | 10 (0)| 00:00:01 |
| 7 | INLIST ITERATOR | | | | | |
| 8 | TABLE ACCESS BY INDEX ROWID| tb_concessions | 2 | 54 | 2 (0)| 00:00:01 |
|* 9 | INDEX UNIQUE SCAN | pk_concession | 2 | | 1 (0)| 00:00:01 |
| 10 | TABLE ACCESS BY INDEX ROWID | tb_activities | 17 | 595 | 4 (0)| 00:00:01 |
|* 11 | INDEX RANGE SCAN | pk_activity | 17 | | 2 (0)| 00:00:01 |
| 12 | TABLE ACCESS BY INDEX ROWID | tb_strings | 1 | 87 | 1 (0)| 00:00:01 |
|* 13 | INDEX UNIQUE SCAN | pk_string | 1 | | 0 (0)| 00:00:01 |
|* 14 | INDEX RANGE SCAN | in_parts | 454 | | 648 (1)| 00:00:08 |
-----------------------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("parts"."value"='E'
AND ("parts"."type"='M' OR "parts"."type"='U')
AND "parts"."activityCode"="activities"."activityCode")
9 - access("concessions"."concessionCode"='G9'
OR "concessions"."concessionCode"='ZD')
11 - access("activities"."concessionCode"="concessions"."concessionCode")
filter("activities"."concessionCode"='G9'
OR "activities"."concessionCode"='ZD')
13 - access("activities"."concessionCode"="strings"."concessionCode"
AND "activities"."activityCode"="strings"."activityCode"
AND "strings"."language"='ENG')
filter("strings"."concessionCode"='G9'
OR "strings"."concessionCode"='ZD')
14 - access("parts"."concessionCode"="activities"."concessionCode"
AND "parts"."date">=TO_DATE('2013-01-01 00:00:00',
'syyyy-mm-dd hh24:mi:ss')
AND "parts"."date"<=TO_DATE(' 2013-04-30 23:59:59',
'syyyy-mm-dd hh24:mi:ss'))
filter("parts"."date">=TO_DATE('2013-01-01 00:00:00',
'syyyy-mm-dd hh24:mi:ss')
AND ("parts"."concessionCode"='G9'
OR "parts"."concessionCode"='ZD')
AND "parts"."date"<=TO_DATE(' 2013-04-30 23:59:59',
'syyyy-mm-dd hh24:mi:ss'))
For four values:
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 186 | 7412 (2)| 00:01:29 |
| 1 | SORT ORDER BY | | 1 | 186 | 7412 (2)| 00:01:29 |
| 2 | HASH GROUP BY | | 1 | 186 | 7412 (2)| 00:01:29 |
| 3 | NESTED LOOPS | | 1 | 186 | 7410 (2)| 00:01:29 |
|* 4 | HASH JOIN | | 17 | 1683 | 7393 (2)| 00:01:29 |
|* 5 | HASH JOIN | | 136 | 8432 | 21 (5)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID| tb_concessions | 4 | 108 | 2 (0)| 00:00:01 |
|* 8 | INDEX UNIQUE SCAN | pk_concession | 4 | | 1 (0)| 00:00:01 |
|* 9 | TABLE ACCESS FULL | tb_activities | 644 | 22540 | 18 (0)| 00:00:01 |
|* 10 | TABLE ACCESS FULL | tb_parts | 4310 | 155K| 7372 (2)| 00:01:29 |
| 11 | TABLE ACCESS BY INDEX ROWID | tb_strings | 1 | 87 | 1 (0)| 00:00:01 |
|* 12 | INDEX UNIQUE SCAN | pk_string | 1 | | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("parts"."activityCode"="activities"."activityCode"
AND "parts"."concessionCode"="activities"."concessionCode")
5 - access("activities"."concessionCode"="concessions"."concessionCode")
8 - access("concessions"."concessionCode"='G9'
OR "concessions"."concessionCode"='JS0'
OR "concessions"."concessionCode"='TR'
OR "concessions"."concessionCode"='ZD')
9 - filter("activities"."concessionCode"='G9'
OR "activities"."concessionCode"='JS0'
OR "activities"."concessionCode"='TR'
OR "activities"."concessionCode"='ZD')
10 - filter("parts"."date">=TO_DATE(' 2013-01-01 00:00:00',
'syyyy-mm-dd hh24:mi:ss')
AND "parts"."value"='E'
AND ("parts"."type"='M' OR "parts"."type"='U')
AND ("parts"."concessionCode"='G9'
OR "parts"."concessionCode"='JS0'
OR "parts"."concessionCode"='TR'
OR "parts"."concessionCode"='ZD')
AND "parts"."date"<=TO_DATE(' 2013-04-30 23:59:59',
'syyyy-mm-dd hh24:mi:ss'))
12 - access("activities"."concessionCode"="strings"."concessionCode"
AND "activities"."activityCode"="strings"."activityCode"
AND "strings"."language"='ENG')
filter("strings"."concessionCode"='G9'
OR "strings"."concessionCode"='JS0'
OR "strings"."concessionCode"='TR'
OR "strings"."concessionCode"='ZD')
And finally, for six values:
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 186 | 4525 (1)| 00:00:55 |
| 1 | SORT ORDER BY | | 1 | 186 | 4525 (1)| 00:00:55 |
| 2 | HASH GROUP BY | | 1 | 186 | 4525 (1)| 00:00:55 |
| 3 | NESTED LOOPS | | 1 | 186 | 4523 (1)| 00:00:55 |
|* 4 | HASH JOIN | | 9 | 891 | 4514 (1)| 00:00:55 |
|* 5 | HASH JOIN | | 136 | 8432 | 21 (5)| 00:00:01 |
| 6 | INLIST ITERATOR | | | | | |
| 7 | TABLE ACCESS BY INDEX ROWID| tb_concessions | 4 | 108 | 2 (0)| 00:00:01 |
|* 8 | INDEX UNIQUE SCAN | pk_concession | 4 | | 1 (0)| 00:00:01 |
|* 9 | TABLE ACCESS FULL | tb_activities | 644 | 22540 | 18 (0)| 00:00:01 |
| 10 | INLIST ITERATOR | | | | | |
|* 11 | TABLE ACCESS BY INDEX ROWID | tb_parts | 2155 | 79735 | 4493 (1)| 00:00:54 |
|* 12 | INDEX RANGE SCAN | in_parts | 8620 | | 1277 (1)| 00:00:16 |
| 13 | TABLE ACCESS BY INDEX ROWID | tb_strings | 1 | 87 | 1 (0)| 00:00:01 |
|* 14 | INDEX UNIQUE SCAN | pk_string | 1 | | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("parts"."activityCode"="activities"."activityCode"
AND "parts"."concessionCode"="activities"."concessionCode")
5 - access("activities"."concessionCode"="concessions"."concessionCode")
8 - access("concessions"."concessionCode"='G9'
OR "concessions"."concessionCode"='JS0'
OR "concessions"."concessionCode"='TR'
OR "concessions"."concessionCode"='ZD')
9 - filter("activities"."concessionCode"='G9'
OR "activities"."concessionCode"='JS0'
OR "activities"."concessionCode"='TR'
OR "activities"."concessionCode"='ZD')
11 - filter("parts"."value"='E'
AND ("parts"."type"='M' OR "parts"."type"='U'))
12 - access(("parts"."concessionCode"='G9'
OR "parts"."concessionCode"='ZD')
AND "parts"."date">=TO_DATE(' 2013-01-01 00:00:00',
'syyyy-mm-dd hh24:mi:ss')
AND "parts"."date"<=TO_DATE(' 2013-04-30 23:59:59',
'syyyy-mm-dd hh24:mi:ss'))
filter("parts"."date">=TO_DATE(' 2013-01-01 00:00:00',
'syyyy-mm-dd hh24:mi:ss')
AND "parts"."date"<=TO_DATE(' 2013-04-30 23:59:59',
'syyyy-mm-dd hh24:mi:ss'))
14 - access("activities"."concessionCode"="strings"."concessionCode"
AND "activities"."activityCode"="strings"."activityCode"
AND "strings"."language"='ENG')
filter("strings"."concessionCode"='G9'
OR "strings"."concessionCode"='JS0'
OR "strings"."concessionCode"='TR'
OR "strings"."concessionCode"='ZD')
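Since this is my first encounter with execution plans, I can only guess at the cause of the delay. Between the four-value and six-value versions, I guess it's the change from FULL ACCESS to ACCESS BY INDEX. Also, when the table is accessed, the filter for four values (id 10) contains all four concession values, whereas for six values the two concession values are in the access part and the filter contains only date, type and value.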
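In general, the reason for this kind of anomaly is that the query optimizer cannot predict costs exactly. The only way to know the cost exactly would be to actually run the statement several times with different execution plans. Instead, the optimizer uses statistics to estimate the cost, and sometimes the estimate is wrong.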
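One way to see where an estimate goes wrong is to put the optimizer's estimated row counts next to the actual ones for each plan step. A minimal sketch, assuming the session is allowed to read the cursor cache (the V$ views behind DBMS_XPLAN.DISPLAY_CURSOR). The GATHER_PLAN_STATISTICS hint and the 'ALLSTATS LAST' format are standard Oracle features; the query is reduced to the tb_parts predicates, with the post's translated column names.

```sql
-- Run the statement once while collecting per-step runtime statistics.
SELECT /*+ GATHER_PLAN_STATISTICS */
       COUNT(*)
FROM tb_parts parts
WHERE parts.concessionCode IN ('ZD', 'G9')
  AND parts.date >= TO_DATE('01/01/2013 00:00:00', 'DD/MM/YYYY HH24:MI:SS')
  AND parts.date <= TO_DATE('30/04/2013 23:59:59', 'DD/MM/YYYY HH24:MI:SS')
  AND parts.type IN ('U', 'M')
  AND parts.value = 'E';

-- Show the plan of the last statement executed in this session,
-- with estimated (E-Rows) and actual (A-Rows) row counts side by side.
-- In SQL*Plus, SET SERVEROUTPUT OFF first so this picks up the query above.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

Steps where E-Rows and A-Rows differ by orders of magnitude are the usual suspects for a badly chosen join method or access path.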
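When you compare the execution plans "for two values" and "for four values", you can see that the latter has a higher estimated cost and that the plans are completely different. The optimizer could choose between these two plans, and it must have judged the first to be better with two values and the second to be better with four. In reality, however, the second is better in both cases.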
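If you analyze such anomalies closely, you often gain some insight, for example that certain combinations of values are over-represented or under-represented in the data. Using histograms in the statistics gives the optimizer more clues and lets it handle "skewed data" better, but its predictive power remains limited.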
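As a sketch of the histogram idea, the block below gathers statistics on tb_parts with a histogram on the concession column and then checks what was recorded. DBMS_STATS.GATHER_TABLE_STATS and the USER_TAB_COL_STATISTICS view are standard Oracle; the choice of column, the bucket count of 254 and the assumption that the table lives in the current schema are illustrative only, and `concessionCode` is the post's translated column name.

```sql
-- Sketch only: base statistics for all columns, plus a histogram on one column.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => USER,          -- assumes the table is in the current schema
    tabname    => 'TB_PARTS',
    method_opt => 'FOR ALL COLUMNS SIZE 1 FOR COLUMNS SIZE 254 concessionCode',
    cascade    => TRUE           -- also gather statistics on the table's indexes
  );
END;
/

-- Check whether a histogram now exists for the column.
SELECT column_name, num_distinct, histogram, num_buckets
FROM user_tab_col_statistics
WHERE table_name = 'TB_PARTS';
```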
In practice, the solution is what you did: rewrite the SQL until you get acceptable performance. Often "hints" (in Oracle) can also give the optimizer more clues.
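To illustrate the hint approach, the sketch below forces the access path to tb_parts in each direction so the two can be timed against each other. FULL and INDEX are standard Oracle hints; note that they reference the alias `parts`, and `in_parts` is the index name given in the question. The statements are reduced to the tb_parts predicates for brevity, the column names are the post's translated placeholders, and in the full query the hint would go in the same position, right after SELECT.

```sql
-- Variant 1: ask for a full table scan of tb_parts.
SELECT /*+ FULL(parts) */
       SUM(parts.quantity) AS quantity
FROM tb_parts parts
WHERE parts.concessionCode IN ('ZD', 'G9')
  AND parts.date >= TO_DATE('01/01/2013 00:00:00', 'DD/MM/YYYY HH24:MI:SS')
  AND parts.date <= TO_DATE('30/04/2013 23:59:59', 'DD/MM/YYYY HH24:MI:SS')
  AND parts.type IN ('U', 'M')
  AND parts.value = 'E';

-- Variant 2: ask for access through the in_parts index instead.
SELECT /*+ INDEX(parts in_parts) */
       SUM(parts.quantity) AS quantity
FROM tb_parts parts
WHERE parts.concessionCode IN ('ZD', 'G9')
  AND parts.date >= TO_DATE('01/01/2013 00:00:00', 'DD/MM/YYYY HH24:MI:SS')
  AND parts.date <= TO_DATE('30/04/2013 23:59:59', 'DD/MM/YYYY HH24:MI:SS')
  AND parts.type IN ('U', 'M')
  AND parts.value = 'E';
```

Hints pin a plan regardless of how the data changes later, so they are probably best treated as a diagnostic tool or a last resort once statistics and the query text have been checked.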