Why does MAX perform so much worse than TOP on an indexed view?

Pau*_*lin 6 sql t-sql sql-server-2008

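I have found that, on an indexed view with an appropriate index, MAX(date) performs a full index scan followed by a stream aggregate, whereas TOP(1) date uses the index optimally and scans only a single row. With a large number of rows this causes serious performance problems. I have included some code below that demonstrates the problem, but I would be interested to know whether anyone can explain why this behavior occurs (it does not occur on a table with a similar index) and whether it is a bug in the SQL Server optimizer (I have tested on 2008 SP2 and R2, and both show the same problem).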

CREATE TABLE dbo.TableWithDate
(
  id INT IDENTITY(1,1) PRIMARY KEY,
  theDate DATE NOT NULL
);

CREATE NONCLUSTERED INDEX [ix_date] ON dbo.TableWithDate([theDate] DESC);

INSERT INTO dbo.TableWithDate(theDate) VALUES('1 MAR 2010'),('1 MAR 2010'), ('3 JUN 2008');

-- Test 1:  max vs top(1) on the table.  They give same optimal plan (scan one row from the index, since index is in order)
SELECT TOP(1) theDate FROM dbo.TableWithDate ORDER BY theDate DESC;
SELECT MAX(theDate) FROM dbo.TableWithDate;

CREATE TABLE dbo.TheJoinTable
(
  identId INT IDENTITY(1,1) PRIMARY KEY,
  foreignId INT NOT NULL,
  someValue INT NOT NULL
);

CREATE NONCLUSTERED INDEX [ix_foreignValue] ON dbo.TheJoinTable([foreignId] ASC);

INSERT INTO dbo.TheJoinTable(foreignId,someValue) VALUES (1,10),(1,20),(1,30),(2,5),(3,6),(3,10);

GO

CREATE VIEW dbo.TheTablesJoined 
WITH SCHEMABINDING
AS 
  SELECT T2.identId, T1.id, T1.theDate, T2.someValue
  FROM dbo.TableWithDate AS T1
  INNER JOIN dbo.TheJoinTable AS T2 ON T2.foreignId=T1.id
GO

-- Notice the different plans:  the TOP one does a scan of 1 row from each and joins
-- The max one does a scan of the entire index and then does seek operations for each item (less efficient)
SELECT TOP(1) theDate FROM dbo.TheTablesJoined ORDER BY theDate DESC;

SELECT MAX(theDate) FROM dbo.TheTablesJoined;

-- But what about if we put an index on the view?  Does that make a difference?
CREATE UNIQUE CLUSTERED INDEX [ix_clust1] ON dbo.TheTablesJoined([identId] ASC);
CREATE NONCLUSTERED INDEX [ix_dateDesc] ON dbo.TheTablesJoined ([theDate] DESC);

-- No!!!! We are still scanning the entire index (look at the actual number of rows) in the MAX case.
SELECT TOP(1) theDate FROM dbo.TheTablesJoined ORDER BY theDate DESC;

SELECT MAX(theDate) FROM dbo.TheTablesJoined;
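
-- Sketch, not part of the original test: one way to quantify the difference
-- is to compare logical reads, which SET STATISTICS IO reports per statement.
SET STATISTICS IO ON;

SELECT TOP(1) theDate FROM dbo.TheTablesJoined ORDER BY theDate DESC;
SELECT MAX(theDate) FROM dbo.TheTablesJoined;

-- Assumption: it may also be worth repeating both queries with the NOEXPAND
-- hint, which makes the optimizer treat the view as a table with its own
-- indexes instead of expanding it to the base tables. No outcome is claimed here.
SELECT TOP(1) theDate FROM dbo.TheTablesJoined WITH (NOEXPAND) ORDER BY theDate DESC;
SELECT MAX(theDate) FROM dbo.TheTablesJoined WITH (NOEXPAND);

SET STATISTICS IO OFF;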

ber*_*d_k 1

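To evaluate any aggregate function such as MAX, all rows in the table have to be read, because any one of their values may be used in the evaluation. TOP 1 only needs to read a single row, which is very fast unless an ORDER BY is imposed and there is no suitable index, in which case the whole table has to be scanned. In those cases you can create a suitable index to improve performance.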

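  • RPM1984: The optimizer knows that the ORDER BY clause matches the order of the index, so it knows it does not have to sort, which means it does not have to read all the rows. (4 upvotes)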
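
Since the question reports that the TOP(1) form already gets the single-row plan, one possible workaround is to express the maximum as a TOP(1) query. The following is a minimal sketch under that assumption, using the view from the question; the column alias maxDate is made up for illustration, and whether the optimizer picks the single-row plan should be confirmed against the actual execution plan.

```sql
-- Hypothetical rewrite of SELECT MAX(theDate) FROM dbo.TheTablesJoined.
-- Wrapping TOP(1) in a scalar subquery keeps MAX's behavior on an empty
-- view: one row containing NULL rather than no rows at all.
-- theDate is declared NOT NULL in the question, so NULL ordering is not a concern.
SELECT
  (SELECT TOP(1) theDate
   FROM dbo.TheTablesJoined
   ORDER BY theDate DESC) AS maxDate;
```

The ORDER BY direction matches the ix_dateDesc index defined in the question, which is the condition the comment above describes for avoiding a sort.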