Oracle projects a lot of I/O even when only a single record is fetched

Luk*_*der 13 sql oracle performance oracle11g sql-execution-plan

I frequently run into the following situation in Oracle execution plans:

Operation                   | Object  | Order | Rows | Bytes | Projection
----------------------------+---------+-------+------+-------+-------------
TABLE ACCESS BY INDEX ROWID | PROD    |     7 |   2M |   28M | PROD.VALUE
  INDEX UNIQUE SCAN         | PROD_PK |     6 |   1  |       | PROD.ROWID

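This is an excerpt from a larger execution plan. Basically, I am accessing (joining) a table through its primary key. Typically there is another table ACCO with ACCO.PROD_ID = PROD.ID, where PROD_PK is the primary key on PROD.ID. Clearly the table can be accessed with a UNIQUE SCAN, but as soon as I have some silly projection on that table, it looks as if the whole table (around 2 million rows) is planned to be read into memory. I get a lot of I/O and buffer gets. When I remove the projection from the larger query, the problem disappears: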

Operation                   | Object  | Order | Rows | Bytes | Projection
----------------------------+---------+-------+------+-------+-------------
TABLE ACCESS BY INDEX ROWID | PROD    |     7 |   1  |     8 | PROD.ID
  INDEX UNIQUE SCAN         | PROD_PK |     6 |   1  |       | PROD.ROWID

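I don't understand this behaviour. What could be the cause? Note that I cannot post the full query. It is quite complex and involves a lot of calculations. The pattern, however, is usually the same.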

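Update: I managed to boil my rather complex setup down to a simple simulation that produces a similar execution plan in both cases (when projecting PROD.VALUE and when leaving it out):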

Create the following database:

-- products have a value
create table prod as
select level as id, 10 as value from dual 
connect by level < 100000;
alter table prod add constraint prod_pk primary key (id);

-- some products are accounts
create table acco as
select level as id, level as prod_id from dual 
connect by level < 50000;
alter table acco 
  add constraint acco_pk primary key (id);
alter table acco 
  add constraint acco_prod_fk foreign key (prod_id) references prod (id);

-- accounts have transactions with values
create table trxs as
select level as id, mod(level, 10) + 1 as acco_id, mod(level, 17) + 1 as value
from dual connect by level < 100000;
alter table trxs 
  add constraint trxs_pk primary key (id);
alter table trxs 
  add constraint trxs_acco_fk foreign key (acco_id) references acco (id);

create index acco_i on acco(prod_id);
create index trxs_i on trxs(acco_id);

alter table acco modify prod_id not null;
alter table trxs modify acco_id not null;
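The Rows and Bytes columns of a plan are estimates derived from optimizer statistics, and on Oracle 11g a `create table ... as select` does not gather table statistics by itself. A minimal sketch for refreshing them on the three test tables before looking at any plan, assuming the tables live in the current schema (that assumption is mine, not something stated in the setup above):

```sql
-- Assumption: PROD, ACCO and TRXS were created in the current schema (USER).
-- cascade => true also gathers statistics for the indexes on each table.
begin
  dbms_stats.gather_table_stats(ownname => user, tabname => 'PROD', cascade => true);
  dbms_stats.gather_table_stats(ownname => user, tabname => 'ACCO', cascade => true);
  dbms_stats.gather_table_stats(ownname => user, tabname => 'TRXS', cascade => true);
end;
/
```

Whether fresh statistics change the estimates in this particular case is not something the setup guarantees; the step only rules out missing statistics as a cause.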

Run the following query:

select v2.*
from (
  select 
    -- This calculates the balance for every transaction as a
    -- running total, subtracting trxs.value from the product's value
    --
    -- This is the "projection" I mentioned that causes I/O. Leaving it
    -- away (setting it to 0), would improve the execution plan
    prod.value - v1.total balance,
    acco.id acco_id
  from (
    select 
      acco_id,
      sum(value) over (partition by acco_id
                       order by id
                       rows between unbounded preceding 
                       and current row) total
    from trxs
  ) v1
  join acco on v1.acco_id = acco.id
  join prod on acco.prod_id = prod.id
) v2
-- This is the single-row access predicate. From here, it is
-- clear that there can only be 1 acco and 1 prod
where v2.acco_id = 1;

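Analysis

When analysing the execution plan of the above query (with or without any prod.value projection), I can reproduce the excessive row/byte estimates in the plan when accessing the prod table.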
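One way to obtain a plan with a projection section like the excerpts at the top is `EXPLAIN PLAN` together with `DBMS_XPLAN.DISPLAY`. This is only a sketch of how the reproduction could be inspected; it assumes the default `PLAN_TABLE` is available and may not be the tool that produced the original excerpts:

```sql
-- Store the estimated plan for the example query in the default PLAN_TABLE.
explain plan for
select v2.*
from (
  select
    prod.value - v1.total balance,
    acco.id acco_id
  from (
    select
      acco_id,
      sum(value) over (partition by acco_id
                       order by id
                       rows between unbounded preceding
                       and current row) total
    from trxs
  ) v1
  join acco on v1.acco_id = acco.id
  join prod on acco.prod_id = prod.id
) v2
where v2.acco_id = 1;

-- Show the plan including the column projection information per step.
select * from table(dbms_xplan.display(null, null, 'TYPICAL +PROJECTION'));
```

Running this twice, once as written and once with `prod.value - v1.total` replaced by `0 - v1.total`, allows the Rows and Bytes estimates on the PROD access to be compared side by side.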

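I have found a workaround for this problem. What I am really interested in, though, is an explanation of what is going wrong and how to correct it without changing the query.

Update

OK, after more analysis, I have to say that the actual problematic I/O was caused by the wrong index being used somewhere else entirely. Unfortunately, this was not projected clearly enough in the overall statistics (or in the execution plan).

As far as this question is concerned, I am still curious about the projected I/O in the execution plan, because it seems to confuse our DBA (and me) again and again. And sometimes, it really is the source of I/O problems...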
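Because the Rows and Bytes figures are estimates, it can help to put them next to the row counts and buffer gets that were actually observed at run time. A hedged sketch using the `gather_plan_statistics` hint and `DBMS_XPLAN.DISPLAY_CURSOR`, assuming a SQL*Plus session that is allowed to read the relevant `V$` views:

```sql
-- SQL*Plus only: keeps DBMS_OUTPUT calls from becoming the "last" statement.
set serveroutput off

-- Run the query with row-source statistics enabled; fetch all rows.
select /*+ gather_plan_statistics */ v2.*
from (
  select
    prod.value - v1.total balance,
    acco.id acco_id
  from (
    select
      acco_id,
      sum(value) over (partition by acco_id
                       order by id
                       rows between unbounded preceding
                       and current row) total
    from trxs
  ) v1
  join acco on v1.acco_id = acco.id
  join prod on acco.prod_id = prod.id
) v2
where v2.acco_id = 1;

-- Plan of the last statement executed in this session, with estimated
-- (E-Rows) and actual (A-Rows) row counts and buffer gets per step.
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
```

A step whose E-Rows is large while its A-Rows and Buffers stay small is only over-estimated, whereas a step with high Buffers is where I/O is really spent, such as the wrongly chosen index mentioned above.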

Luk*_*der 0

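For the record, I have looked at various scenarios, including a specific solution for this specific example. Rephrasing the example query like this solves the problem in this case: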

select
  -- Explicitly project value in a nested loop. This seems to be much cheaper
  -- in this specific case
  (select value from prod where id = v2.prod_id) - v2.balance,
  v2.acco_id
from (
  select 
    -- Now, balance is only a running total, not the running total
    -- added to PROD.VALUE
    v1.total balance,
    acco.id acco_id,
    acco.prod_id prod_id
  from (
    select 
      acco_id,
      sum(value) over (partition by acco_id
                       order by id
                       rows between unbounded preceding 
                       and current row) total
    from trxs
  ) v1
  -- The JOIN of PROD is no longer needed
  join acco on v1.acco_id = acco.id
) v2
where v2.acco_id = 1;

But I still don't understand why Oracle projects so much I/O in its execution plan if I join prod earlier in this query...