Ron*_*nis 9 sql database oracle cost-based-optimizer anchor-modeling
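I have been able to get join elimination working for simple cases, such as one-to-one relationships, but not for slightly more complicated scenarios. Ultimately I want to try anchor modeling, but first I need to find a way around this problem. I am using Oracle 12c Enterprise Edition Release 12.1.0.2.0.
DDL for my test case: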
drop view product_5nf;
drop table product_color cascade constraints;
drop table product_price cascade constraints;
drop table product cascade constraints;
create table product(
product_id number not null
,constraint product_pk primary key(product_id)
);
create table product_color(
product_id number not null references product
,color varchar2(10) not null
,constraint product_color_pk primary key(product_id)
);
create table product_price(
product_id number not null references product
,from_date date not null
,price number not null
,constraint product_price_pk primary key(product_id, from_date)
);
Some sample data:
insert into product values(1);
insert into product values(2);
insert into product values(3);
insert into product values(4);
insert into product_color values(1, 'Red');
insert into product_color values(2, 'Green');
insert into product_price values(1, date '2016-01-01', 10);
insert into product_price values(1, date '2016-02-01', 8);
insert into product_price values(1, date '2016-05-01', 5);
insert into product_price values(2, date '2016-02-01', 5);
insert into product_price values(4, date '2016-01-01', 10);
commit;
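The first view does not compile; it fails with ORA-01799: a column may not be outer-joined to a subquery. Unfortunately, this is how most of the historized views are defined in the online examples of anchor modeling that I have looked at...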
create view product_5nf as
select p.product_id
,pc.color
,pp.price
from product p
left join product_color pc on(
pc.product_id = p.product_id
)
left join product_price pp on(
pp.product_id = p.product_id
and pp.from_date = (select max(pp2.from_date)
from product_price pp2
where pp2.product_id = pp.product_id)
);
Below is my attempt at fixing it. When this view is used with a simple select of product_id, Oracle manages to eliminate product_color but not product_price.
create view product_5nf as
select product_id
,pc.color
,pp.price
from product p
left join product_color pc using(product_id)
left join (select pp1.product_id, pp1.price
from product_price pp1
where pp1.from_date = (select max(pp2.from_date)
from product_price pp2
where pp2.product_id = pp1.product_id)
)pp using(product_id);
select product_id
from product_5nf;
----------------------------------------------------------
| Id | Operation | Name | Rows |
----------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 |
|* 1 | HASH JOIN OUTER | | 4 |
| 2 | INDEX FAST FULL SCAN| PRODUCT_PK | 4 |
| 3 | VIEW | | 3 |
| 4 | NESTED LOOPS | | 3 |
| 5 | VIEW | VW_SQ_1 | 5 |
| 6 | HASH GROUP BY | | 5 |
| 7 | INDEX FULL SCAN | PRODUCT_PRICE_PK | 5 |
|* 8 | INDEX UNIQUE SCAN | PRODUCT_PRICE_PK | 1 |
----------------------------------------------------------
The only solution I have found is to use a scalar subquery, like this:
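For reference, a plan listing like the one above can be produced with EXPLAIN PLAN and DBMS_XPLAN.DISPLAY. This is only a minimal sketch: the format string is an assumption chosen to give a similarly compact output, and is not necessarily what was used for the listing in this post.

```sql
-- Ask the optimizer for the plan of the query against the view.
explain plan for
select product_id
  from product_5nf;

-- Show the plan with the basic columns plus row estimates and predicates.
-- The format string is an assumption; the default format works as well.
select *
  from table(dbms_xplan.display(format => 'BASIC +ROWS +PREDICATE'));
```

If PRODUCT_COLOR or PRODUCT_PRICE (or one of their indexes) still appears in the Name column, the corresponding join was not eliminated.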
create or replace view product_5nf as
select p.product_id
,pc.color
,(select pp.price
from product_price pp
where pp.product_id = p.product_id
and pp.from_date = (select max(from_date)
from product_price pp2
where pp2.product_id = pp.product_id)) as price
from product p
left join product_color pc on(
pc.product_id = p.product_id
);
select product_id
from product_5nf;
---------------------------------------------------
| Id | Operation | Name | Rows |
---------------------------------------------------
| 0 | SELECT STATEMENT | | 4 |
| 1 | INDEX FAST FULL SCAN| PRODUCT_PK | 4 |
---------------------------------------------------
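Now Oracle successfully eliminates the product_price table. However, scalar subqueries are implemented differently from joins, and the way they are executed simply does not give me acceptable performance in real-world scenarios.
TL;DR
How can I rewrite the view product_5nf so that Oracle successfully eliminates both dependent tables?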
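I think you have two problems here.
First, join elimination only works in certain specific situations (PK-PK, PK-FK, and so on). It is not a general mechanism where you can LEFT JOIN to any row set that returns a single row per join key value and have Oracle eliminate the join.
Second, even if Oracle were advanced enough to do join elimination on ANY LEFT JOIN where it knew it could get only one row per join key value, Oracle does not yet support join elimination on LEFT JOINs that are based on a composite key (Oracle Support document 887553.1 mentions that this is coming in R12.2).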
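To make the distinction concrete, here is a small sketch against the tables from the question. The outcomes noted in the comments follow the behavior reported in the question on 12.1.0.2; they are not guaranteed on other versions, so check the plan on your own system.

```sql
-- Case 1: the join column is covered by a single-column unique key,
-- product_color_pk(product_id), and no column of pc is referenced.
-- The question reports that this outer join is eliminated.
select p.product_id
  from product p
  left join product_color pc on pc.product_id = p.product_id;

-- Case 2: the inline view returns at most one row per product_id, but there
-- is no declared unique constraint on product_id alone. The primary key of
-- product_price is the composite (product_id, from_date).
-- The question reports that this outer join is NOT eliminated.
select p.product_id
  from product p
  left join (select pp1.product_id, pp1.price
               from product_price pp1
              where pp1.from_date = (select max(pp2.from_date)
                                       from product_price pp2
                                      where pp2.product_id = pp1.product_id)
            ) pp on pp.product_id = p.product_id;
```

In both cases the joined row source cannot change the number of rows returned, but only in the first case is that fact visible to the optimizer through a declared single-column constraint.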
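One workaround you could consider is to materialize a view holding the latest row for each product_id, and then LEFT JOIN to that materialized view. Like this: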
create table product(
product_id number not null
,constraint product_pk primary key(product_id)
);
create table product_color(
product_id number not null references product
,color varchar2(10) not null
,constraint product_color_pk primary key(product_id)
);
create table product_price(
product_id number not null references product
,from_date date not null
,price number not null
,constraint product_price_pk primary key (product_id, from_date )
);
-- Add a VIRTUAL column to PRODUCT_PRICE so that we can get all the data for
-- the latest row by taking the MAX() of this column.
alter table product_price add ( sortable_row varchar2(80) generated always as ( lpad(product_id,10,'0') || to_char(from_date,'YYYYMMDDHH24MISS') || lpad(price,10,'0')) virtual not null );
-- Create an MV log so that the MV holding only the latest row for each
-- product_id can be fast-refreshed on commit.
create materialized view log on product_price with sequence, primary key, rowid ( price ) including new values;
-- Create the MV
create materialized view product_price_latest refresh fast on commit enable query rewrite as
SELECT product_id, max( lpad(product_id,10,'0') || to_char(from_date,'YYYYMMDDHH24MISS') || lpad(price,10,'0')) sortable_row
FROM product_price
GROUP BY product_id;
-- Create a primary key on the MV, so we can do join elimination
alter table product_price_latest add constraint ppl_pk primary key ( product_id );
-- Insert the OP's test data
insert into product values(1);
insert into product values(2);
insert into product values(3);
insert into product values(4);
insert into product_color values(1, 'Red');
insert into product_color values(2, 'Green');
insert into product_price ( product_id, from_date, price ) values(1, date '2016-01-01', 10 );
insert into product_price ( product_id, from_date, price) values(1, date '2016-02-01', 8);
insert into product_price ( product_id, from_date, price) values(1, date '2016-05-01', 5);
insert into product_price ( product_id, from_date, price) values(2, date '2016-02-01', 5);
insert into product_price ( product_id, from_date, price) values(4, date '2016-01-01', 10);
commit;
-- Create the 5NF view using the materialized view
create or replace view product_5nf as
select p.product_id
,pc.color
,to_date(substr(ppl.sortable_row,11,14),'YYYYMMDDHH24MISS') from_date
,to_number(substr(ppl.sortable_row,25)) price
from product p
left join product_color pc on pc.product_id = p.product_id
left join product_price_latest ppl on ppl.product_id = p.product_id
;
-- The plan for this should not include any of the unnecessary tables.
select product_id from product_5nf;
-- Check the plan
SELECT *
FROM TABLE (DBMS_XPLAN.display_cursor (null, null,
'ALLSTATS LAST'));
------------------------------------------------
| Id | Operation | Name | E-Rows |
------------------------------------------------
| 0 | SELECT STATEMENT | | |
| 1 | INDEX FULL SCAN | PRODUCT_PK | 1 |
------------------------------------------------
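As a quick sanity check, the following sketch assumes the objects above were created exactly as shown. It selects the non-key columns, so that the joins are actually needed and sortable_row has to be decoded, and then adds a newer price to see whether the on-commit refresh picks it up. The extra price row is made-up test data, not part of the original example.

```sql
-- Referencing color and price means neither join can be eliminated here.
select product_id, color, from_date, price
  from product_5nf
 order by product_id;

-- Hypothetical extra row: a newer price for product 2.
-- The MV is declared REFRESH FAST ON COMMIT, so it should reflect
-- this row once the transaction commits.
insert into product_price (product_id, from_date, price)
values (2, date '2016-06-01', 7);
commit;

select product_id, color, from_date, price
  from product_5nf
 where product_id = 2;
```

Note that the encoding relies on LPAD, which truncates values longer than the target width, so this approach assumes product_id and price each fit in 10 characters and that prices are non-negative.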