Efficiently finding distinct values

Question

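I have a number of tables with the primary key (month, year, number), each with a different cardinality. The history of (month, year) tuples does not go back far, and in the long run there will probably be no more than 50 of them. For each (month, year) tuple there are at most 2 million unique numbers. I want to know which combinations of month and year are available. I do this with the following query: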

select month, year from table group by month, year

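This returns the correct result, but it does not seem efficient. What is an efficient way to get this result (making use of the unique index)?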

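The tuning advisor suggests adding a (month, year) index for this query, but that seems wasteful, since a larger index covering those columns is already available.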

Answer 1

Jac*_*las 5

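You can use a variation of the following technique, which forces repeated "min/max" range scans.

Assuming:

you can generate a list of all possible year/month combinations
number is not null (it cannot be, since it is part of the PK, but I mention it because there is a workaround if nulls are allowed)

Test bench: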

create table foo(month, year, num, primary key(month, year, num)) as
with m as ( select extract(month from d) as month, extract(year from d) as year
            from (select add_months(sysdate,1-level) as d from dual connect by level<50) )
select month, year, num
from m cross join 
     (select level as num from dual connect by level<100000 order by dbms_random.random());


Normal query:

select distinct month, year from foo;
--gets=11656


Min/max technique:

with m as ( select extract(month from d) as month, extract(year from d) as year
            from (select add_months(sysdate,1-level) as d from dual connect by level<50) )
select month, year, decode(( select min(num)
                             from foo
                             where month=m.month and year=m.year )
                           ,null, 'N', 'Y') as has_data_yn
from m;
--gets=294
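
The answer does not say how the gets figures in the comments were measured. One way to obtain comparable numbers, assuming SQL*Plus and a user with the privileges that autotrace requires, is to read the "consistent gets" statistic for each statement. This is a sketch of one approach, not necessarily the one used here, and the exact figures will vary with the data and block size:

```sql
-- SQL*Plus: suppress the result rows and print execution statistics instead
set autotrace traceonly statistics

-- baseline: plain distinct over the whole table
select distinct month, year from foo;

-- min/max technique: one probe per candidate (month, year)
with m as ( select extract(month from d) as month, extract(year from d) as year
            from (select add_months(sysdate,1-level) as d from dual connect by level<50) )
select month, year, decode(( select min(num)
                             from foo
                             where month=m.month and year=m.year )
                           ,null, 'N', 'Y') as has_data_yn
from m;

-- switch statistics reporting back off
set autotrace off
```

Comparing the "consistent gets" line reported after each statement gives the same kind of figure as the gets comments in the answer.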


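Some explanation in response to the comments:

In each case (the test bench and the min/max query), the subquery factoring clause just generates a list of (year, month) tuples: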

with m as ( select extract(month from d) as month, extract(year from d) as year
            from (select add_months(sysdate,1-level) as d from dual connect by level<50) )
select * from m;
/*
MONTH                  YEAR                   
---------------------- ---------------------- 
1                      2012                   
12                     2011                   
11                     2011                   
10                     2011           
...
...
*/


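The technique then uses a subquery in the select clause to check whether any rows exist for a given (month, year). A subquery in that position must return at most one row, which the min aggregate guarantees: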

select min(num)
from foo
where month=m.month and year=m.year;
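
To check that a single probe really reads only the start of the index range for one pair, its execution plan can be inspected. The sketch below assumes the test-bench table foo and uses the literal pair (1, 2012) from the sample output as a stand-in for m.month and m.year. One would hope to see a min/max range scan on the primary key index in the plan, but that is an expectation to verify on your own system rather than a guarantee:

```sql
-- literal values stand in for m.month and m.year; use a pair that exists in foo
explain plan for
select min(num)
from foo
where month=1 and year=2012;

-- show the plan that was just explained
select * from table(dbms_xplan.display);
```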


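This is very fast because it exploits the ordered nature of the PK. However, it has to be executed once per candidate month. That makes sense when there are millions of rows per month, but not when there are so few rows that the whole table fits into a small number of blocks.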
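
If only the available combinations are wanted, rather than a Y/N flag for every candidate, a variant along these lines might serve. It is a sketch under the same assumptions as the answer (the candidate list m and the test-bench table foo), and it is not taken from the original answer. Because it tests for the existence of a row instead of looking at min(num), it also does not depend on num being non-null. The optimizer is free to choose a different plan for it, so the gets should be measured before relying on it:

```sql
-- candidate (month, year) pairs, generated exactly as in the answer
with m as ( select extract(month from d) as month, extract(year from d) as year
            from (select add_months(sysdate,1-level) as d from dual connect by level<50) )
select m.month, m.year
from m
-- keep a candidate only if at least one row exists for it in foo
where exists ( select null
               from foo
               where foo.month=m.month and foo.year=m.year );
```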
