Mar*_*ith 22 sql sql-server sql-server-2008
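I've found that in execution plans that use common subexpression spools, the reported logical reads get very high for large tables.
After some trial and error I found a formula that seems to hold for the test script and execution plan below:
    Worktable logical reads = 1 + NumberOfRows * 2 + NumberOfGroups * 4
I don't understand why this formula holds, though. It is more reads than I would have thought necessary from looking at the plan. Can anyone give a blow-by-blow account of what is going on that accounts for this?
Failing that, is there any way to trace which page was read in each logical read, so that I can work it out for myself?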
```sql
SET STATISTICS IO OFF;
SET NOCOUNT ON;

IF Object_id('tempdb..#Orders') IS NOT NULL
  DROP TABLE #Orders;

CREATE TABLE #Orders
  (
     OrderID    INT IDENTITY(1, 1) NOT NULL PRIMARY KEY CLUSTERED,
     CustomerID NCHAR(5) NULL,
     Freight    MONEY NULL,
  );

CREATE NONCLUSTERED INDEX ix
  ON #Orders (CustomerID)
  INCLUDE (Freight);

INSERT INTO #Orders
VALUES (N'ALFKI', 29.46),
       (N'ALFKI', 61.02),
       (N'ALFKI', 23.94),
       (N'ANATR', 39.92),
       (N'ANTON', 22.00);

SELECT PredictedWorktableLogicalReads =
       1 + 2 * Count(*) + 4 * Count(DISTINCT CustomerID)
FROM   #Orders;

SET STATISTICS IO ON;

SELECT OrderID,
       Freight,
       Avg(Freight) OVER (PARTITION BY CustomerID) AS Avg_Freight
FROM   #Orders;
```
Output:
```
PredictedWorktableLogicalReads
------------------------------
23
```
```
Table 'Worktable'. Scan count 3, logical reads 23, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#Orders___________000000000002'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
```
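To see where the 23 comes from for this particular data set, the formula can be rearranged per partition: each group contributes 2 reads per row plus 4, and the constant 1 is added once. The query below is only a sketch of that arithmetic, not an explanation of what the spools do. The column aliases are illustrative, and the CustomerID values are inlined so the query runs without #Orders.

```sql
-- Hypothetical rearrangement of the question's formula, per partition:
--   1 + 2 * NumberOfRows + 4 * NumberOfGroups
--     = 1 + SUM over groups of (2 * rows_in_group + 4)
SELECT CustomerID,
       RowsInGroup       = COUNT(*),
       GroupContribution = 2 * COUNT(*) + 4
FROM   (VALUES (N'ALFKI'), (N'ALFKI'), (N'ALFKI'),
               (N'ANATR'),
               (N'ANTON')) AS O (CustomerID) -- same CustomerID values as #Orders
GROUP  BY CustomerID;
```

For this data the three contributions are 10 (ALFKI, 3 rows), 6 (ANATR) and 6 (ANTON). Adding the constant 1 gives 23, which matches the logical reads reported for the worktable.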
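Additional info:
There is a good explanation of these spools in Chapter 3 of the Query Tuning and Optimization book and in this blog post by Paul White.
In summary, the segment iterator at the top of the plan adds a flag to the rows it sends, indicating when a row is the start of a new partition. The primary segment spool gets one row at a time from the segment iterator and inserts it into a work table in tempdb. Once it gets the flag saying that a new group has started, it returns a row to the top input of the nested loops operator. This causes the stream aggregate to be invoked over the rows in the work table; the average is computed, and this value is then joined back to the rows in the work table before the work table is truncated, ready for the new group. The segment spool emits a dummy row so that the final group gets processed.
As far as I understand it, the worktable is a heap (otherwise it would be shown in the plan as an index spool). However, when I try to replicate the same process, it needs only 11 logical reads.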
```sql
CREATE TABLE #WorkTable
  (
     OrderID    INT,
     CustomerID NCHAR(5) NULL,
     Freight    MONEY NULL,
  )

DECLARE @Average MONEY

PRINT 'Insert 3 Rows'
INSERT INTO #WorkTable
VALUES (1, N'ALFKI', 29.46) /*Scan count 0, logical reads 1*/
INSERT INTO #WorkTable
VALUES (2, N'ALFKI', 61.02) /*Scan count 0, logical reads 1*/
INSERT INTO #WorkTable
VALUES (3, N'ALFKI', 23.94) /*Scan count 0, logical reads 1*/

PRINT 'Calculate AVG'
SELECT @Average = Avg(Freight)
FROM   #WorkTable /*Scan count 1, logical reads 1*/

PRINT 'Return Rows - With the average column included'
/*This convoluted query is just to force a nested loops plan*/
SELECT *
FROM   (SELECT @Average AS Avg_Freight) T /*Scan count 1, logical reads 1*/
       OUTER APPLY #WorkTable
WHERE  COALESCE(Freight, OrderID) IS NOT NULL
       AND @Average IS NOT NULL

PRINT 'Clear out work table'
TRUNCATE TABLE #WorkTable

PRINT 'Insert 1 Row'
INSERT INTO #WorkTable
VALUES (4, N'ANATR', 39.92) /*Scan count 0, logical reads 1*/

PRINT 'Calculate AVG'
SELECT @Average = Avg(Freight)
FROM   #WorkTable /*Scan count 1, logical reads 1*/

PRINT 'Return Rows - With the average column included'
SELECT *
FROM   (SELECT @Average AS Avg_Freight) T /*Scan count 1, logical reads 1*/
       OUTER APPLY #WorkTable
WHERE  COALESCE(Freight, OrderID) IS NOT NULL
       AND @Average IS NOT NULL

PRINT 'Clear out work table'
TRUNCATE TABLE #WorkTable

PRINT 'Insert 1 Row'
INSERT INTO #WorkTable
VALUES (5, N'ANTON', 22.00) /*Scan count 0, logical reads 1*/

PRINT 'Calculate AVG'
SELECT @Average = Avg(Freight)
FROM   #WorkTable /*Scan count 1, logical reads 1*/

PRINT 'Return Rows - With the average column included'
SELECT *
FROM   (SELECT @Average AS Avg_Freight) T /*Scan count 1, logical reads 1*/
       OUTER APPLY #WorkTable
WHERE  COALESCE(Freight, OrderID) IS NOT NULL
       AND @Average IS NOT NULL

PRINT 'Clear out work table'
TRUNCATE TABLE #WorkTable

PRINT 'Calculate AVG'
SELECT @Average = Avg(Freight)
FROM   #WorkTable /*Scan count 1, logical reads 0*/

PRINT 'Return Rows - With the average column included'
SELECT *
FROM   (SELECT @Average AS Avg_Freight) T
       OUTER APPLY #WorkTable
WHERE  COALESCE(Freight, OrderID) IS NOT NULL
       AND @Average IS NOT NULL

DROP TABLE #WorkTable
```
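The figure of 11 can be checked against the /* ... */ annotations in the script above. The tally below is a hypothetical helper, not part of the original script: it lists one row per annotated statement, taking the read count from that statement's comment. The last "Return Rows" query carries no annotation, so it is left out.

```sql
-- Hypothetical tally of the annotated logical reads in the manual replication.
SELECT Step,
       LogicalReads = SUM(Reads)
FROM   (VALUES ('Insert', 1), ('Insert', 1), ('Insert', 1), -- ALFKI rows
               ('Insert', 1),                               -- ANATR row
               ('Insert', 1),                               -- ANTON row
               ('Calculate AVG', 1), ('Calculate AVG', 1), ('Calculate AVG', 1),
               ('Calculate AVG', 0),                        -- final pass over the empty table
               ('Return Rows', 1), ('Return Rows', 1), ('Return Rows', 1)
       ) AS T (Step, Reads)
GROUP  BY Step WITH ROLLUP; -- the row where Step is NULL holds the grand total
```

The inserts account for 5 reads, the averages for 3 and the returned rows for 3, so the grand total is 11, against the 23 reported for the worktable.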
小智 21
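Logical reads are counted differently for worktables: there is one "logical read" per row read. This does not mean that worktables are somehow less efficient than a "real" spool table (quite the reverse); the logical reads are just in different units.
I believe the thinking was that counting hashed pages for worktable logical reads would not be very useful, because these structures are internal to the server. Reporting rows spooled in the logical reads counter makes the number more meaningful for analysis purposes.
This insight should make the reason your formula works clear. The two secondary spools are fully read twice (2 * COUNT(*)), and the primary spool emits (number of group values + 1) rows, as explained in my blog entry, giving the (COUNT(DISTINCT CustomerID) + 1) component. The plus one is for the extra row the primary spool emits to indicate that the final group has ended.
Paul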
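As a rough sketch of the two components named in the answer above, the query below evaluates them for the question's five-row test data. The column aliases are illustrative and the CustomerID values are inlined so the query runs on its own; neither comes from the original posts.

```sql
-- Hypothetical evaluation of the two components the answer names,
-- counted in rows rather than pages.
SELECT SecondarySpoolReads = 2 * COUNT(*),                  -- secondary spools fully read twice
       PrimarySpoolRows    = COUNT(DISTINCT CustomerID) + 1 -- one row per group, plus the end-of-data row
FROM   (VALUES (N'ALFKI'), (N'ALFKI'), (N'ALFKI'),
               (N'ANATR'),
               (N'ANTON')) AS O (CustomerID); -- same CustomerID values as #Orders
```

For this data the query returns 10 and 4. The answer itemizes only these two pieces, so the sketch does not try to reproduce the full 23 reads reported in the question.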