Hen*_*sen 21 performance sql-server partitioning query-performance
我在 PCMag 中阅读了 Itzik Ben-Gan 的这些文章:
搜索,您应扫描第 I 部分:当优化器未优化
搜索时,您应扫描第 II 部分:升序键
我目前的所有分区表都存在“最大分组”问题。我们使用Itzik Ben-Gan 提供的技巧来获取 max(ID),但有时它不会运行:
DECLARE @MaxIDPartitionTable BIGINT
SELECT @MaxIDPartitionTable = ISNULL(MAX(IDPartitionedTable), 0)
FROM ( SELECT *
FROM ( SELECT partition_number PartitionNumber
FROM sys.partitions
WHERE object_id = OBJECT_ID('fct.MyTable')
AND index_id = 1
) T1
CROSS APPLY ( SELECT ISNULL(MAX(UpdatedID), 0) AS IDPartitionedTable
FROM fct.MyTable s
WHERE $PARTITION.PF_MyTable(s.PCTimeStamp) = PartitionNumber
AND UpdatedID <= @IDColumnThresholdValue
) AS o
) AS T2;
SELECT @MaxIDPartitionTable
Run Code Online (Sandbox Code Playgroud)
我得到这个计划
但是 45 分钟后,看看读数
reads writes physical_reads
12,949,127 2 12,992,610
Run Code Online (Sandbox Code Playgroud)
我得到了sp_whoisactive
。
通常它运行得很快,但今天不是。
编辑:带分区的表结构:
CREATE PARTITION FUNCTION [MonthlySmallDateTime](SmallDateTime) AS RANGE RIGHT FOR VALUES (N'2000-01-01T00:00:00.000', N'2000-02-01T00:00:00.000' /* and many more */)
go
CREATE PARTITION SCHEME PS_FctContractualAvailability AS PARTITION [MonthlySmallDateTime] TO ([Standard], [Standard])
GO
CREATE TABLE fct.MyTable(
MyTableID BIGINT IDENTITY(1,1),
[DT1TurbineID] INT NOT NULL,
[PCTimeStamp] SMALLDATETIME NOT NULL,
Filler CHAR(100) NOT NULL DEFAULT 'N/A',
UpdatedID BIGINT NULL,
UpdatedDate DATETIME NULL
CONSTRAINT [PK_MyTable] PRIMARY KEY CLUSTERED
(
[DT1TurbineID] ASC,
[PCTimeStamp] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION = ROW) ON [PS_FctContractualAvailability]([PCTimeStamp])
) ON [PS_FctContractualAvailability]([PCTimeStamp])
GO
CREATE UNIQUE NONCLUSTERED INDEX [IX_UpdatedID_PCTimeStamp] ON [fct].MyTable
(
[UpdatedID] ASC,
[PCTimeStamp] ASC
)
INCLUDE ( [UpdatedDate])
WHERE ([UpdatedID] IS NOT NULL)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION = ROW) ON [PS_FctContractualAvailability]([PCTimeStamp])
GO
Run Code Online (Sandbox Code Playgroud)
Pau*_*ite 28
基本问题是 Index Seek 后面没有 Top 操作符。这是一种优化,通常在搜索以正确的MIN\MAX
聚合顺序返回行时引入。
这种优化利用了最小/最大行是按升序或降序排列的第一个这一事实。也可能是优化器无法将此优化应用于分区表;我忘了。
无论如何,关键是如果没有这种转换,执行计划最终会处理每个S.UpdatedID <= @IDColumnThresholdValue
分区符合条件的每一行,而不是每个分区所需的一行。
您没有在问题中提供表、索引或分区定义,所以我不能更具体。您应该检查您的索引是否支持这种转换。或多或少等效地,您也可以将 the 表示MAX
为 a TOP (1) ... ORDER BY UpdatedID DESC
。
如果这导致 Sort (包括TopN Sort),则您知道您的索引没有帮助。例如:
SELECT
@MaxIDPartitionTable = ISNULL(MAX(T2.IDPartitionedTable), 0)
FROM
(
SELECT
O.IDPartitionedTable
FROM
(
SELECT
P.partition_number AS PartitionNumber
FROM sys.partitions AS P
WHERE
P.[object_id] = OBJECT_ID(N'fct.MyTable', N'U')
AND P.index_id = 1
) AS T1
CROSS APPLY
(
SELECT TOP (1)
S.UpdatedID AS IDPartitionedTable
FROM fct.MyTable AS S
WHERE
$PARTITION.PF_MyTable(S.PCTimeStamp) = T1.PartitionNumber
AND S.UpdatedID <= @IDColumnThresholdValue
ORDER BY
S.UpdatedID DESC
) AS O
) AS T2;
Run Code Online (Sandbox Code Playgroud)
这应该产生的计划形状是:
注意索引搜索下方的顶部。这将处理限制为每个分区一行。
或者,使用临时表来保存分区号:
CREATE TABLE #Partitions
(
partition_number integer PRIMARY KEY CLUSTERED
);
INSERT #Partitions
(partition_number)
SELECT
P.partition_number AS PartitionNumber
FROM sys.partitions AS P
WHERE
P.[object_id] = OBJECT_ID(N'fct.MyTable', N'U')
AND P.index_id = 1;
SELECT
@MaxIDPartitionTable = ISNULL(MAX(T2.UpdatedID), 0)
FROM #Partitions AS P
CROSS APPLY
(
SELECT TOP (1)
S.UpdatedID
FROM fct.MyTable AS S
WHERE
$PARTITION.PF_MyTable(S.PCTimeStamp) = P.partition_number
AND S.UpdatedID <= @IDColumnThresholdValue
ORDER BY
S.UpdatedID DESC
) AS T2;
DROP TABLE #Partitions;
Run Code Online (Sandbox Code Playgroud)
旁注:访问查询中的系统表可防止并行。如果这很重要,请考虑在临时表中具体化分区号,然后APPLY
从那里开始。并行在这种模式中通常没有帮助(具有正确的索引),但我不提它会失职。
旁注 2:有一个活动的 Connect 项要求MIN\MAX
对分区对象上的聚合和 Top提供内置支持。