Execute subquery factoring first before any other SQL

psa*_*j12 8 sql oracle performance subquery query-optimization

I have a very complex view of the following form:

create or replace view loan_vw as 
select * from (with loan_info as (select loan_table.*,commission_table.* 
                                   from loan_table,
                                  commission_table where 
                                  contract_id=commission_id)
                select /*complex transformations */ from loan_info
                where type <> 'PRINCIPAL'
                union all 
                select /*complex transformations */ from loan_info
                where type = 'PRINCIPAL')

Now, if I run the following, the query hangs:

         select * from loan_vw where contract_id='HA001234TY56';

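However, if I hard-code the value inside the subquery factoring (WITH) clause, or use a package-level variable in the same session, the query returns within a second: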

create or replace view loan_vw as 
        select * from (with loan_info as (select loan_table.*,commission_table.* 
                                           from loan_table,
                                          commission_table where 
                                          contract_id=commission_id
                                          and contract_id='HA001234TY56'
                                          )
                        select /*complex transformations */ from loan_info
                        where type <> 'PRINCIPAL'
                        union all 
                        select /*complex transformations */ from loan_info
                        where type = 'PRINCIPAL')

Since I am using Business Objects, I cannot use a package-level variable.

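So my question is: is there a hint in Oracle that tells the optimizer to apply the contract_id predicate from the query on loan_vw inside the subquery factoring clause first?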

As requested, the analytic function used is as follows:

select value_date, item, credit_entry, item_paid
from (
  select value_date, item, credit_entry, debit_entry,
    greatest(0, least(credit_entry, nvl(sum(debit_entry) over (), 0)
      - nvl(sum(credit_entry) over (order by value_date
          rows between unbounded preceding and 1 preceding), 0))) as item_paid
  from your_table
)
where item is not null;

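Following the suggestions from Boneist and MarcinJ, I removed the subquery factoring (CTE) and wrote one long query as shown below, which improved the response time from 3 minutes to 0.156 seconds: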

  create or replace view loan_vw as
  select /*complex transformations */
                               from loan_table,
                              commission_table where 
                              contract_id=commission_id
               and loan_table.type <> 'PRINCIPAL'
  union all
  select /*complex transformations */
                               from loan_table,
                              commission_table where 
                              contract_id=commission_id
               and loan_table.type = 'PRINCIPAL'

Mar*_*inJ 4

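Are the transformations really so complex that you need UNION ALL? It is really hard to optimize something you can't see, but have you tried getting rid of the CTE and implementing the calculations inline?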

CREATE OR REPLACE VIEW loan_vw AS
SELECT loan.contract_id
     , CASE commission.type -- or wherever this comes from
         WHEN 'PRINCIPAL'
         THEN SUM(whatever) OVER (PARTITION BY loan.contract_id, loan.type) -- total_whatever

         ELSE SUM(something_else) OVER (PARTITION BY loan.contract_id, loan.type) -- total_something_else
      END AS whatever_something
  FROM loan_table loan 
 INNER 
  JOIN commission_table commission
    ON loan.contract_id = commission.commission_id

Note that if your analytic functions don't have PARTITION BY contract_id, you won't be able to use an index on the contract_id column at all.

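Take a look at this db fiddle (you have to click ... on the last result table to expand the results). Here, the loan table has an indexed (PK) contract_id column, and also a some_other_id column that is unique but not indexed, while the predicate on the outer query is still on contract_id. If you compare the plans for partition by contract and partition by other id, you will see that the index is not used at all in the partition by other id plan: there is a TABLE ACCESS with the FULL option on the loan table, compared to INDEX - UNIQUE SCAN in the partition by contract plan. That is because the optimizer cannot work out the relationship between contract_id and some_other_id on its own, so it has to run SUM or AVG over the entire window instead of using the index to limit the number of rows in the window.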
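The comparison can be reproduced with a minimal sketch along these lines. The table, the amount column, and both view names are hypothetical and only stand in for the fiddle's schema; they are not the asker's objects.

```sql
-- Hypothetical schema for illustration only.
CREATE TABLE loan (
  contract_id   VARCHAR2(20) PRIMARY KEY,  -- indexed through the primary key
  some_other_id NUMBER NOT NULL,           -- unique in practice, but not indexed
  amount        NUMBER
);

-- Window partitioned by the column that the outer query filters on.
CREATE OR REPLACE VIEW loan_by_contract_vw AS
SELECT contract_id,
       SUM(amount) OVER (PARTITION BY contract_id) AS total_amount
  FROM loan;

-- Window partitioned by a different column.
CREATE OR REPLACE VIEW loan_by_other_vw AS
SELECT contract_id,
       SUM(amount) OVER (PARTITION BY some_other_id) AS total_amount
  FROM loan;

-- Same predicate against both views; compare the two plans.
EXPLAIN PLAN FOR
SELECT * FROM loan_by_contract_vw WHERE contract_id = 'HA001234TY56';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

EXPLAIN PLAN FOR
SELECT * FROM loan_by_other_vw WHERE contract_id = 'HA001234TY56';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Filtering on a PARTITION BY column cannot change the window result for the rows that remain, which is why the predicate may be pushed below the window in the first view but not in the second. The exact plan you get still depends on your Oracle version and statistics.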

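If you have a dimension table with those contracts, you can also try joining it to your results and exposing contract_id from the dimension table instead of the (most likely huge) loan fact table. Sometimes this improves cardinality estimates thanks to the unique index on the dimension table.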
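A rough sketch of that idea, assuming a hypothetical contract_dim table with a unique index on contract_id and a hypothetical amount column on loan_table (neither appears in the question):

```sql
-- contract_dim and loan.amount are assumptions made for illustration.
CREATE OR REPLACE VIEW loan_dim_vw AS
SELECT dim.contract_id,   -- exposed from the dimension, not from the fact table
       loan.type,
       SUM(loan.amount) OVER (PARTITION BY dim.contract_id, loan.type) AS total_amount
  FROM contract_dim dim
 INNER JOIN loan_table loan
    ON loan.contract_id = dim.contract_id
 INNER JOIN commission_table commission
    ON loan.contract_id = commission.commission_id;
```

A filter such as WHERE contract_id = 'HA001234TY56' on this view then refers to the dimension column. Whether that actually improves the estimates is something to confirm in the execution plan.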

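Again, it is really hard to optimize a black box without the query or even a plan, so we don't know what is going on. The CTE or a subquery could be getting materialized unnecessarily, for example.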
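One way to check for unnecessary materialization is to look at the plan of the slow query. The sketch below uses the view and literal from the question. The second statement is an experiment rather than a confirmed fix: it relies on the INLINE hint, which is widely used but not officially documented by Oracle, and it assumes contract_id and type live in loan_table and commission_id in commission_table.

```sql
-- Plan for the slow query from the question.
EXPLAIN PLAN FOR
SELECT * FROM loan_vw WHERE contract_id = 'HA001234TY56';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Experiment: ask the optimizer not to materialize the CTE.
-- INLINE is an undocumented hint; column ownership is assumed.
EXPLAIN PLAN FOR
WITH loan_info AS (
  SELECT /*+ INLINE */ loan_table.contract_id, loan_table.type
    FROM loan_table, commission_table
   WHERE loan_table.contract_id = commission_table.commission_id
)
SELECT * FROM loan_info WHERE type <> 'PRINCIPAL'
UNION ALL
SELECT * FROM loan_info WHERE type = 'PRINCIPAL';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

A TEMP TABLE TRANSFORMATION step with a SYS_TEMP_ table in the first plan indicates that the CTE is being materialized before the contract_id filter is applied.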