Oracle top-n查询排序性能

Lex*_*Lex 6 sql oracle query-optimization oracle12c

我是sql优化的新手,我试图理解为什么IN子句中有多个项目会导致性能受到很大影响,如果可能的话,我怎么能阻止它.以下是我正在使用的或多或少的东西.第二个查询是我现在所拥有的,我正在寻求提高性能.在现实生活中,TABLE_1有数百万行,计划的排序部分的CPU成本为21M.

SELECT 
    TOPNWRAPPER.*, 
    TABLE_2.X, 
    TABLE_2.Y 
FROM 
    TABLE_2, 
    ( 
        SELECT 
            * 
        FROM 
            ( 
                SELECT 
                    /*+ index (TABLE_1 TABLE_1_B_E_F_ID) */ 
                    TABLE_1.ID, 
                    TABLE_1.C, 
                    TABLE_1.B, 
                    TABLE_1.E, 
                    TABLE_1.F
                FROM 
                    TABLE_1 
                WHERE 
                    ( TABLE_1.F IN ( ‘STATE1’ ) ) AND 
                    ( TABLE_1.B= 'SOMETEXT' ) AND 
                    ( TABLE_1.C=1 ) AND 
                    ( TABLE_1.E= 'IN' ) AND 
                    ( TABLE_1.D IS NULL ) 
                ORDER BY 
                    TABLE_1.ID 
            ) 
        WHERE 
            ( ROWNUM <= 150 ) 
    ) TOPNWRAPPER 
WHERE 
    ( TOPNWRAPPER.ID = TABLE_2.T1_ID_FK ) 
ORDER BY 
    TOPNWRAPPER.ID ASC
Run Code Online (Sandbox Code Playgroud)

统计:

|--------------------------------------------------------------------------------------------------------------------------|
|| Id  | Operation                        | Name                        | Starts | E-Rows | A-Rows |   A-Time   | Buffers ||
|--------------------------------------------------------------------------------------------------------------------------|
||   0 ||SELECT STATEMENT                 |                             |      1 |        |    120 |00:00:00.01 |     965 ||
||   1 |||NESTED LOOPS                    |                             |      1 |        |    120 |00:00:00.01 |     965 ||
||   2 ||||NESTED LOOPS                   |                             |      1 |      1 |    120 |00:00:00.01 |     845 ||
||   3 |||||VIEW                          |                             |      1 |      1 |    120 |00:00:00.01 |     245 ||
||*  4 ||||||COUNT STOPKEY                |                             |      1 |        |    120 |00:00:00.01 |     245 ||
||   5 |||||||VIEW                        |                             |      1 |      1 |    120 |00:00:00.01 |     245 ||
||*  6 ||||||||TABLE ACCESS BY INDEX ROWID| TABLE_1                     |      1 |      1 |    120 |00:00:00.01 |     245 ||
||*  7 |||||||||INDEX RANGE SCAN          | TABLE_1_B_E_F_ID            |      1 |     25 |    120 |00:00:00.01 |     125 ||
||*  8 |||||INDEX RANGE SCAN              | TABLE_2_T1_ID_FK            |    120 |      1 |    120 |00:00:00.01 |     600 ||
||   9 ||||TABLE ACCESS BY INDEX ROWID    | TABLE_2                     |    120 |      1 |    120 |00:00:00.01 |     120 ||
|--------------------------------------------------------------------------------------------------------------------------|
|                                                                                                                          |
|Predicate Information (identified by operation id):                                                                       |
|---------------------------------------------------                                                                       |
|                                                                                                                          |
|   4 - filter(ROWNUM<=150)                                                                                                |
|   6 - filter((“TABLE_1”.”C”=1 AND “TABLE_1”.”D” IS NULL))                                                                |
|   7 - access(“TABLE_1”.”B”='SOMETEXT' AND                                                                                |
|              “TABLE_1”.”E”=‘IN' AND “TABLE_1”.”F”=’STATE1’)                                                              |
|   8 - access(“TOPNWRAPPER”.”ID”=“TABLE_2”.”T1_ID_FK”)                                                                         |
+--------------------------------------------------------------------------------------------------------------------------+
Run Code Online (Sandbox Code Playgroud)

当我在IN子句中更新查询以获得"STATE2"时,会向计划中添加一个额外的排序步骤.

SELECT 
    TOPNWRAPPER.*, 
    TABLE_2.X, 
    TABLE_2.Y 
FROM 
    TABLE_2, 
    ( 
        SELECT 
            * 
        FROM 
            ( 
                SELECT 
                    /*+ index (TABLE_1 TABLE_1_B_E_F_ID) */ 
                    TABLE_1.ID, 
                    TABLE_1.C, 
                    TABLE_1.B, 
                    TABLE_1.E, 
                    TABLE_1.F
                FROM 
                    TABLE_1 
                WHERE 
                    ( TABLE_1.F IN ( 'STATE1', 'STATE2' ) ) AND 
                    ( TABLE_1.B= 'SOMETEXT' ) AND 
                    ( TABLE_1.C=1 ) AND 
                    ( TABLE_1.E= 'IN' ) AND 
                    ( TABLE_1.D IS NULL ) 
                ORDER BY 
                    TABLE_1.ID 
            ) 
        WHERE 
            ( ROWNUM <= 150 ) 
    ) TOPNWRAPPER 
WHERE 
    ( TOPNWRAPPER.ID = TABLE_2.T1_ID_FK ) 
ORDER BY 
    TOPNWRAPPER.ID ASC
Run Code Online (Sandbox Code Playgroud)

统计:

|-------------------------------------------------------------------------------------------------------------------------------------------------------|
|| Id  | Operation                          | Name                        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem ||
|-------------------------------------------------------------------------------------------------------------------------------------------------------|
||   0 ||SELECT STATEMENT                   |                             |      1 |        |    150 |00:00:00.01 |    1076 |       |       |          ||
||   1 |||NESTED LOOPS                      |                             |      1 |        |    150 |00:00:00.01 |    1076 |       |       |          ||
||   2 ||||NESTED LOOPS                     |                             |      1 |      1 |    150 |00:00:00.01 |     926 |       |       |          ||
||   3 |||||VIEW                            |                             |      1 |      1 |    150 |00:00:00.01 |     176 |       |       |          ||
||*  4 ||||||COUNT STOPKEY                  |                             |      1 |        |    150 |00:00:00.01 |     176 |       |       |          ||
||   5 |||||||VIEW                          |                             |      1 |      1 |    150 |00:00:00.01 |     176 |       |       |          ||
||*  6 ||||||||SORT ORDER BY STOPKEY        |                             |      1 |      1 |    150 |00:00:00.01 |     176 | 15360 | 15360 |14336  (0)||
||   7 |||||||||INLIST ITERATOR             |                             |      1 |        |    165 |00:00:00.01 |     176 |       |       |          ||
||*  8 ||||||||||TABLE ACCESS BY INDEX ROWID| TABLE_1                     |      2 |      1 |    165 |00:00:00.01 |     176 |       |       |          ||
||*  9 |||||||||||INDEX RANGE SCAN          | TABLE_1_B_E_F_ID            |      2 |     50 |    165 |00:00:00.01 |      11 |       |       |          ||
||* 10 |||||INDEX RANGE SCAN                | TABLE_2_T1_ID_FK            |    150 |      1 |    150 |00:00:00.01 |     750 |       |       |          ||
||  11 ||||TABLE ACCESS BY INDEX ROWID      | TABLE_2                     |    150 |      1 |    150 |00:00:00.01 |     150 |       |       |          ||
|-------------------------------------------------------------------------------------------------------------------------------------------------------|
|                                                                                                                                                       |
|Predicate Information (identified by operation id):                                                                                                    |
|---------------------------------------------------                                                                                                    |
|                                                                                                                                                       |
|   4 - filter(ROWNUM<=150)                                                                                                                             |
|   6 - filter(ROWNUM<=150)                                                                                                                             |
|   8 - filter((“TABLE_1”.”C”=1 AND “TABLE_1”.”D” IS NULL))                                         |
|   9 - access(“TABLE_1”.”B”='SOMETEXT' AND                                                                                |
|              “TABLE_1”.”E”='IN' AND ((“TABLE_1”.”F”='STATE1') OR (“TABLE_1”.”F”='STATE2'))                                              |
|  10 - access(“TOPNWRAPPER”.”ID”=“TABLE_2”.”T1_ID_FK”)                                                              |
|                                                                                                                                                       |
+-------------------------------------------------------------------------------------------------------------------------------------------------------+
Run Code Online (Sandbox Code Playgroud)

我一直在寻找几天.我尝试过的一个建议就是使用提示/*+ USE_CONCAT (OR_PREDICATES(1)) */,这有助于将mem使用量减少一半,但它并没有完全消除问题.

编辑:环顾四周(http://use-the-index-luke.com/sql/sorting-grouping/indexed-order-by#tip-ixord-full)并认为这可能是由于订单.如果我改变由语句顺序是:TABLE_1.F,TABLE_1.IDTOPNWRAPPER.F,TOPNWRAPPER.ID ASC排序操作消失那么,不幸的是我需要的ID的前n行.或者,我尝试在(ID F)上创建一个新索引进行测试,它也删除了排序操作,但每行ID是唯一的,这使得表访问操作效率降低.

编辑2:

OPERATION      |OPTION           |CPU COST
--------------------------------------------
SORT           |ORDER BY STOPKEY |21042774
|NESTED LOOPS  |OUTER            |56052
||TABLE ACCESS |BY INDEX ROWID   |38980
|||INDEX       |RANGE SCAN       |30086
Run Code Online (Sandbox Code Playgroud)

Jon*_*ler 2

性能差异可能并不重要。执行计划的差异是因为如果前导列使用相等条件,则多列索引访问只会隐式排序。

性能差异

不要太担心执行计划的成本。尽管它被称为“基于成本的优化器”,但成本是一个奇怪的数字,世界上只有少数人完全理解。

比较解释计划成本很复杂的原因之一是总成本有时小于子操作成本之一。正如我在此处的回答中所解释的,这可能会在COUNT STOPKEY手术中发生。这是 Oracle 的说法,“这个子操作花费如此巨大的费用,但 COUNT STOPKEY 可能会在它达到那么高之前将其切断”。通常最好比较计划的最高成本,但即使是这个数字有时也可能会产生误导,正如该答案中的其他示例所表明的那样。

这意味着通常运行时间是唯一重要的事情。如果两个查询的 A-Time(实际时间)仅为 0.1 秒,那么您的工作可能已经完成。

执行计划差异

执行计划的差异是由多列索引的存储和访问方式造成的。有时扫描索引时,结果会自动存储,有时则不会。这就是为什么一个计划有COUNT STOPKEY而另一个计划更昂贵SORT ORDER BY STOPKEY

为了演示此计划差异,请创建一个仅包含 2 列和 4 行的简单表和索引:

create table test1 as 
select 1 a, 10 b from dual union all
select 1 a, 30 b from dual union all
select 2 a, 20 b from dual union all
select 2 a, 40 b from dual;

create index test1_idx on test1(a, b);

begin
    dbms_stats.gather_table_stats(user, 'TEST1');
end;
/
Run Code Online (Sandbox Code Playgroud)

下面是如何存储索引的简化想法。数据首先按前导列排序存储,然后按尾随列排序。

create table test1 as 
select 1 a, 10 b from dual union all
select 1 a, 30 b from dual union all
select 2 a, 20 b from dual union all
select 2 a, 40 b from dual;

create index test1_idx on test1(a, b);

begin
    dbms_stats.gather_table_stats(user, 'TEST1');
end;
/
Run Code Online (Sandbox Code Playgroud)

如果查询仅访问前导列 A 中的一个值,那么它可以按顺序读取列 B 中的值,而无需任何额外的工作。Oracle 无需尝试就先访问 A 块之一,然后按顺序读取 B 块。

请注意此查询如何有一个但执行计划中ORDER BY没有。SORT

explain plan for select * from test1 where a = 1 and b > 0 order by b;
select * from table(dbms_xplan.display(format => 'basic'));

Plan hash value: 598212486

--------------------------------------
| Id  | Operation        | Name      |
--------------------------------------
|   0 | SELECT STATEMENT |           |
|   1 |  INDEX RANGE SCAN| TEST1_IDX |
--------------------------------------
Run Code Online (Sandbox Code Playgroud)

但是,如果查询访问前导列 A 中的多个值,则不一定会按顺序检索 B 中的结果。Oracle 可以按顺序读取 A 块,但 B 块顺序仅适用于一个A 值。

现在,SORT ORDER BY执行计划中出现了一个额外的操作。

explain plan for select * from test1 where a in (1,2) and b > 0 order by b;
select * from table(dbms_xplan.display(format => 'basic'));

Plan hash value: 704117715

----------------------------------------
| Id  | Operation          | Name      |
----------------------------------------
|   0 | SELECT STATEMENT   |           |
|   1 |  SORT ORDER BY     |           |
|   2 |   INLIST ITERATOR  |           |
|   3 |    INDEX RANGE SCAN| TEST1_IDX |
----------------------------------------
Run Code Online (Sandbox Code Playgroud)

这就是为什么更改column1 = value1column1 in (value1, value2)可能会增加额外的SORT操作。