Lei*_*fel 5 index oracle oracle-11g-r2 oracle-11g
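By running the following, I can determine whether an index would benefit from compression and how many columns should be included in the compression: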
ANALYZE INDEX Owner.IndexName VALIDATE STRUCTURE OFFLINE;
SELECT Opt_Cmpr_PctSave, Opt_Cmpr_Count FROM Index_Stats;
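The problem is that when OFFLINE is changed to ONLINE, the Index_Stats view is not populated. Is there an online way to determine the benefit of compressing an index and/or the number of columns that would yield the best compression?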
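Update:
http://jonathanlewis.wordpress.com/index-definitions/ says that if Distinct_Keys in DBA_Indexes is "a lot smaller" than num_rows, the index is a good candidate for compression. That helps somewhat, but it is not definitive and does not help determine the number of columns. He does offer some guidelines for that, but nothing that can be determined programmatically without a bunch of dynamic SQL.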
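The Distinct_Keys versus num_rows comparison can be read straight from the dictionary. This is a minimal sketch, assuming optimizer statistics on the index are reasonably current; 'OWNER' and 'INDEXNAME' are placeholders, and the ratio is only a rough indicator with no particular threshold implied:

```sql
-- Rough indicator only: many rows per distinct key means repeated key
-- values that compression could exploit. It says nothing about how many
-- leading columns to compress. 'OWNER' / 'INDEXNAME' are placeholders.
select owner, index_name, num_rows, distinct_keys,
       round(num_rows / nullif(distinct_keys, 0), 1) as rows_per_key
from   dba_indexes
where  owner = 'OWNER'
and    index_name = 'INDEXNAME';
```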
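The optimal number of columns to compress depends on:
These factors can be estimated from the table.
The aim is to maximize the size of the compressed prefix while minimizing the number of blocks needed to hold all the rows that share the same prefix.
Assuming the data is at least somewhat uniform, and ignoring the small overhead that compression introduces, you could try implementing this approach like so:
Helper functions: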
-- Average VSIZE (in bytes) of one column of a table, plus 1.
create or replace function f_size( p_table_name  in varchar,
                                   p_column_name in varchar )
return number as
  n number;
begin
  execute immediate
    'select avg(vsize('||p_column_name||'))+1 from '||p_table_name into n;
  return n;
end;
/
-- Number of distinct combinations of a comma-separated list of columns.
create or replace function f_count( p_table_name   in varchar,
                                    p_column_names in varchar )
return integer as
  n integer;
begin
  execute immediate 'select count(*) '||
                    'from ( select '|| p_column_names ||
                    ' from '||p_table_name||' '||
                    'group by '||p_column_names||' )'
    into n;
  return n;
end;
/
Test IOT (index-organized table):
create table t ( k1, k2, k3, k4, k5, val,
constraint pk_t primary key(k1, k2, k3, k4, k5))
organization index as
select mod(k,10)||'_____',
mod(k,20)||'_____',
mod(k,30)||'_____',
mod(k,50)||'_____',
k||'_____',
lpad(' ',100)
from (select level as k from dual connect by level<=1000);
Query:
with utc as ( select table_name, column_name,
                     f_size(table_name, column_name) as column_size
              from user_tab_columns
              where table_name='T' ),
     uic as ( select table_name, column_name, column_position, column_size
              from user_ind_columns join utc using(table_name, column_name)
              where index_name='PK_T' )
select z.*, (8192-prefix_size*prefixes_per_block)/remaining_size as rows_per_block
from( select z.*, greatest(1,8192/(prefix_size+rows_per_prefix*remaining_size)) as prefixes_per_block
      from( select z.*, total_count/distinct_count as rows_per_prefix
            from( select prefix_length,
                         sum(column_size) as prefix_size,
                         (select sum(column_size) from utc)-sum(column_size) as remaining_size,
                         f_count(table_name, max(prefix_columns)) as distinct_count,
                         (select count(*) from t) as total_count
                  from( select table_name,
                               connect_by_root column_position as prefix_length,
                               column_size,
                               substr(sys_connect_by_path(column_name, ','),2) as prefix_columns
                        from uic
                        connect by column_position=(prior column_position-1) )
                  group by table_name, prefix_length ) z ) z ) z
order by 1;
Result:
PREFIX_LENGTH PREFIX_SIZE REMAINING_SIZE DISTINCT_COUNT TOTAL_COUNT ROWS_PER_PREFIX PREFIXES_PER_BLOCK ROWS_PER_BLOCK
---------------------- ---------------------- ---------------------- ---------------------- ---------------------- ---------------------- ---------------------- ----------------------
1 7 132.854 10 1000 100 1 61.608
2 14.5 125.354 20 1000 50 1.304 65.200
3 22.161 117.693 60 1000 16.666 4.129 68.827
4 29.961 109.893 300 1000 3.333 20.672 68.909
5 38.854 101 1000 1000 1 58.575 58.575
Check:
analyze index pk_t validate structure;
select opt_cmpr_pctsave, opt_cmpr_count from index_stats;
OPT_CMPR_PCTSAVE OPT_CMPR_COUNT
---------------------- ----------------------
13 3
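The check above roughly corresponds to the prefix length that gives the maximum rows_per_block in the calculation (the estimate peaks at 4, only marginally ahead of 3, while Oracle reports 3) - but I suggest you double-check my working before trusting it :)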
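One way to double-check the arithmetic for a single row is to recompute it by hand from the displayed inputs. The sketch below plugs in the prefix_length = 3 row (prefix_size 22.161, remaining_size 117.693, 60 distinct prefixes out of 1000 rows) and the 8192-byte block size the query assumes; the greatest(1, ...) guard is left out because the value is already above 1 for this row. The results should come out close to the 4.129 and 68.827 shown for that row, allowing for rounding of the displayed inputs:

```sql
-- Recompute the prefix_length = 3 row of the result from its inputs.
-- rows_per_prefix = 1000/60; 8192 is the block size assumed by the query.
select 8192 / (22.161 + (1000/60) * 117.693) as prefixes_per_block,
       (8192 - 22.161 * (8192 / (22.161 + (1000/60) * 117.693))) / 117.693
         as rows_per_block
from   dual;
```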
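I am assuming the table is too large for you to simply make a copy and try out different prefix lengths. Another approach would be to do this on a sample of the data; the sample should be chosen as a random selection of prefixes for a given compression candidate (rather than just a random selection of rows).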
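For the test table t, sampling by prefix rather than by row might look roughly like the following. It is only a sketch: the name t_sample, the candidate prefix length of 3 and the sample size of 10 prefixes are arbitrary choices for illustration, and on a genuinely large table the distinct-prefix subquery may itself be expensive.

```sql
-- Copy every row belonging to a random selection of (k1, k2, k3) prefixes
-- into a compressed IOT. t_sample, "compress 3" and the 10-prefix sample
-- are illustrative assumptions, not recommendations.
create table t_sample ( k1, k2, k3, k4, k5, val,
  constraint pk_t_sample primary key(k1, k2, k3, k4, k5))
organization index compress 3 as
select k1, k2, k3, k4, k5, val
from   t
where  (k1, k2, k3) in
       ( select k1, k2, k3
         from ( select k1, k2, k3
                from ( select distinct k1, k2, k3 from t )
                order by dbms_random.value )
         where rownum <= 10 );

-- Validating offline is harmless here because t_sample is a throwaway copy.
analyze index pk_t_sample validate structure;
select lf_blks from index_stats;
```

Repeating this with the same sampling approach for other candidate prefix lengths and comparing lf_blks would give an empirical counterpart to the estimate, at the cost of building one small copy per candidate.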