Help understanding an explain plan in Oracle

fil*_*ppo 6 performance oracle oracle-11g-r2 explain query-performance

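I am running a query against some large tables. It runs fine, even though there is a lot of data, but I would like to understand which part of it weighs most on the execution. Unfortunately, I am not very good at reading explain plans, so I am asking for help.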

Here is some data about the tables:

  • history_state_table 7.424.65 rows (of which only 13.412 remain after t1.alarm_type = 'AT1')
  • costumer_price_history 448.284.169
  • cycle_table 215

This is the query (never mind the logic, it is only for reference):

SELECT t1.id_alarm, t2.load_id, t2.reference_date
  FROM history_state_table t1,
       (SELECT   op_code, contract_num,
                 COUNT (DISTINCT id_ponto) AS num_pontos,
                 COUNT
                    (DISTINCT CASE
                        WHEN vlr > 0
                           THEN id_ponto
                        ELSE NULL
                     END
                    ) AS bigger_than_zero,
                 MAX (load_id) AS load_id,
                 MAX (reference_date) AS reference_date
            FROM costumer_price_history
           WHERE load_id IN
                            (42232, 42234, 42236, 42238, 42240, 42242, 42244) /* arbitrary IDs depending on execution*/
             AND sistema = 'F1'          /* Hardcoded filters */
             AND rec_type = 'F3'         /* Hardcoded filters */
             AND description = 'F3'      /* Hardcoded filters */
             AND extract_type IN
                    ('T1', 'T2', 'T3')
        GROUP BY op_code, contract_num) t2
 WHERE t1.op_code = t2.op_code
   AND t1.contract_num = t2.contract_num
   AND t1.alarm_type = 'AT1'
   AND t1.alarm_status = 'DONE'
   AND (   (    t1.prod_type = 'COMBO'
            AND t2.bigger_than_zero = t2.num_pontos - 1
           )
        OR (    t1.prod_type != 'COMBO'
            AND t2.bigger_than_zero = t2.num_pontos
           )
       )
       /* arbitrary filter depending on execution*/
   AND t1.data_tratado BETWEEN (SELECT data_inicio
                                  FROM cycle_table
                                 WHERE id_ciclo = 160) AND (SELECT data_fim
                                                              FROM cycle_table
                                                             WHERE id_ciclo =
                                                                           160)

And finally, the explain plan:

Plan
SELECT STATEMENT  ALL_ROWS  Cost: 5,485
    13 NESTED LOOPS                         
        7 NESTED LOOPS  Cost: 5,483  Bytes: 115  Cardinality: 1                     
            5 VIEW  Cost: 12  Bytes: 59  Cardinality: 1                 
                4 SORT GROUP BY  Cost: 12  Bytes: 85  Cardinality: 1            
                    3 INLIST ITERATOR       
                        2 TABLE ACCESS BY INDEX ROWID TABLE RAIDPIDAT.COSTUMER_PRICE_HISTORY Cost: 11  Bytes: 85  Cardinality: 1    
                            1 INDEX RANGE SCAN INDEX RAIDPIDAT.IDX_COSTUMER_PRICE_HISTORY_2 Cost: 10  Cardinality: 3  
            6 INDEX RANGE SCAN INDEX RAIDPIDAT.IDX_HISTORY_STATE_TABLE_1TPALM Cost: 662  Cardinality: 102,068               
        12 TABLE ACCESS BY INDEX ROWID TABLE RAIDPIDAT.HISTORY_STATE_TABLE Cost: 5,471  Bytes: 56  Cardinality: 1                   
            9 TABLE ACCESS BY INDEX ROWID TABLE RAIDPIDAT.CYCLE_TABLE Cost: 1  Bytes: 12  Cardinality: 1                
                8 INDEX UNIQUE SCAN INDEX (UNIQUE) RAIDPIDAT.PK_CYCLE_TABLE Cost: 0  Cardinality: 1             
            11 TABLE ACCESS BY INDEX ROWID TABLE RAIDPIDAT.CYCLE_TABLE Cost: 1  Bytes: 12  Cardinality: 1               
                10 INDEX UNIQUE SCAN INDEX (UNIQUE) RAIDPIDAT.PK_CYCLE_TABLE Cost: 0  Cardinality: 1    

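Note that I am not asking "how do I rewrite it more efficiently", but how I can find the most costly operation using the explain plan. I am reading up on it in the meantime, but I would appreciate some help.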

kub*_*zyk 3

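The explain plan does not tell you which "operation" is actually the most costly. The "Cost" column is a guess: it is a value estimated by the optimizer. The same goes for the "Cardinality" and "Bytes" columns. See http://docs.oracle.com/cd/B28359_01/server.111/b28274/ex_plan.htm#i18300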

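In your example, the optimizer is telling you: I decided to use this plan because I guess the loops will cost about 5,483. I expect this to be the most costly part of the execution, but I cannot guarantee it.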

The same applies recursively at every depth of the tree.
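To see how far those estimates are from reality, one option is to have Oracle record actual row-source statistics and display them next to the estimates. The sketch below assumes the session is allowed to read the V$ views that DBMS_XPLAN.DISPLAY_CURSOR relies on, and it uses a cut-down stand-in statement on history_state_table rather than the full query; in practice the hint would go into the real statement.

```sql
-- Stand-in statement (an assumption for illustration, not the full query).
-- The GATHER_PLAN_STATISTICS hint asks Oracle to record actual rows,
-- elapsed time and buffer gets for every step of the plan.
SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*)
  FROM history_state_table t1
 WHERE t1.alarm_type = 'AT1'
   AND t1.alarm_status = 'DONE';

-- Show the plan of the last statement executed in this session,
-- with estimated and actual figures side by side.
SELECT *
  FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

In the output, E-Rows is the optimizer's estimate and A-Rows is what actually happened; A-Time and Buffers show where the work went. A-Time is cumulative, so a parent step includes the time of its children. In SQL*Plus, SERVEROUTPUT should be off, otherwise the "last" statement may not be the one you just ran.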

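If you go down to the lowest level (that is, the level that intuitively gets looped over and executed the most), the operation that stands out, both in expected cost and in expected number of rows, is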

6 INDEX RANGE SCAN INDEX RAIDPIDAT.IDX_HISTORY_STATE_TABLE_1TPALM Cost: 662  Cardinality: 102,068 

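So the optimizer guesses that the best way to execute this query is to do a lot of looping around the poor workhorse RAIDPIDAT.IDX_HISTORY_STATE_TABLE_1TPALM. I cannot really see which part of your query relates to it directly, but I suspect the t1.data_tratado condition. And again, I do not know whether it really is the most costly part.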
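To check which conditions that index can actually serve, the data dictionary lists its columns. A minimal sketch, assuming the index is visible to the current user through ALL_IND_COLUMNS; the owner and index name are taken from the plan above.

```sql
-- List the columns of the index used in step 6, in index order.
-- Conditions on the leading columns are the ones a range scan can use
-- to narrow the scanned range.
SELECT column_position, column_name
  FROM all_ind_columns
 WHERE index_owner = 'RAIDPIDAT'
   AND index_name  = 'IDX_HISTORY_STATE_TABLE_1TPALM'
 ORDER BY column_position;
```

If data_tratado is not among the leading columns, that condition is probably applied later, during the table access in step 12, rather than in the index range scan itself.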

I will try to translate the loop syntax of the explain plan into procedural pseudocode:

/* begin step 13 (by "step 13" I mean a line that reads "   13 NESTED LOOPS") */
  /* begin step 7 */
    do step 5
    myresult = rows from step 5
    for each row from myresult {
       do step 6
       for each row from step 6 {
           join to a row from myresult the matching row from step 6
       }
    }
  /* end step 7 */
  for each row from myresult {
     do step 12
     for each row from step 12 {
         join to a row from myresult the matching row from step 12
     }
  }
/* end step 13 */
return myresult

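It looks complicated, but the real purpose of each "nested loop" is to build a join (a single table made out of two tables) in the simplest possible way: a loop inside a loop.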