Are partition numbers guaranteed to be ordered by value?

And*_*son 6 sql-server partitioning

Suppose I have a partitioned table set up as follows:

CREATE PARTITION FUNCTION PF_Month (DATE) AS RANGE RIGHT FOR VALUES (
  '2017-01-01',
  '2017-02-01',
  '2017-03-01',
  '2017-04-01',
  '2017-05-01',
  '2017-06-01',
);
GO

CREATE PARTITION SCHEME PS_Month AS PARTITION PF_Month ALL TO ([Primary]);
GO

CREATE TABLE Logs
(
     Id           INT NOT NULL,
     DateRecorded DATE NOT NULL,
     FixStatus    INT NOT NULL
);
GO

ALTER TABLE Logs
    ADD CONSTRAINT PK_Logs PRIMARY KEY (Id, DateRecorded)
    ON PS_Month(DateRecorded);
GO

CREATE NONCLUSTERED INDEX [IX_DateRecorded] ON Logs(DateRecorded)
    INCLUDE(FixStatus)
    ON PS_Month(DateRecorded);
GO

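If I want to query the logs in date order, I have been told that I can use the partition number to avoid a sort when the results from each partition are concatenated back together.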

SELECT * FROM Logs WHERE ... ORDER BY $PARTITION.PF_Month(DateRecorded), DateRecorded

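Are partition numbers assigned in the order in which each partition was created, or in order of DateRecorded? For example, if I add another split to the function while it is in use, will ordering by partition number still work?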

ALTER PARTITION FUNCTION PF_Month() SPLIT RANGE ('2016-12-01')

Geo*_*son 5

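I have not been able to find a formal guarantee of this behavior, but I did find multiple examples, including a modified version of your original example, in which query optimization decisions appear to rely on a guarantee that partition numbers are in order by value. Here is what I found:

Your example

Your sample code does not currently compile, and it also uses an index that does not cover the sample query. Here is an updated script containing working code for the original example.

With these updates, we can see that the query plan for the ORDER BY query contains no Sort operator; the query optimizer appears to rely on the fact that partition numbers are ordered by value in order to omit the extra sort.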

-- Create a partition-aligned index that includes all
-- columns in your SELECT statements
CREATE NONCLUSTERED INDEX [IX_DateRecorded] ON Logs(DateRecorded)
     INCLUDE (FixStatus) ON PS_Month(DateRecorded)
GO

-- Now that your index is properly defined, this query produces
-- an ordered index scan without a sort
SELECT * FROM Logs ORDER BY DateRecorded
GO
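One way to look for a Sort operator without opening the graphical plan is to request the estimated plan as text. This is a minimal sketch, assuming the Logs table and the partition-aligned index from the script above already exist; the explicit column list is my own substitution for SELECT *.

```sql
-- SET SHOWPLAN_TEXT must be the only statement in its batch.
SET SHOWPLAN_TEXT ON;
GO

-- While SHOWPLAN_TEXT is ON the query is not executed;
-- SQL Server returns the estimated plan as rows of text instead.
SELECT Id, DateRecorded, FixStatus
FROM Logs
ORDER BY DateRecorded;
GO

SET SHOWPLAN_TEXT OFF;
GO
```

If the optimizer is relying on partition order, the plan text should show an ordered index scan and no Sort line. Treat this as a check on one plan, not as proof of a guarantee.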

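Documentation

There is strong evidence in the documentation that partition numbers are ordered by value, but I agree that the documentation is not conclusive on its own. Here is some of the evidence I found: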

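The CREATE PARTITION FUNCTION documentation states that the partitions start out in order:

If the values are not in order, the Database Engine sorts them, creates the function, and returns a warning that the values are not provided in order.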

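The sys.partition_range_values documentation states that boundary_id, which it seems can be used to derive the partition number, is the

ID (1-based ordinal) of the boundary value tuple, with left-most boundary starting at an ID of 1.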

The $PARTITION documentation states:

$PARTITION returns an int value between 1 and the number of partitions of the partition function.
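The relationship between boundary_id and the partition number can be inspected directly from the catalog views. The query below is a sketch that assumes the PF_Month function from the question exists in the current database; the column aliases are my own.

```sql
-- List every boundary of PF_Month together with the partition number
-- that $PARTITION reports for the boundary value itself.
SELECT prv.boundary_id,
       prv.value AS boundary_value,
       -- prv.value is sql_variant, so convert it to the function's input type.
       -- With RANGE RIGHT, a boundary value belongs to the partition on its right.
       $PARTITION.PF_Month(CONVERT(date, prv.value)) AS partition_number
FROM sys.partition_functions AS pf
JOIN sys.partition_range_values AS prv
    ON prv.function_id = pf.function_id
WHERE pf.name = N'PF_Month'
ORDER BY prv.boundary_id;
```

If partition numbers follow boundary order, each row should show partition_number equal to boundary_id + 1 for a RANGE RIGHT function, with both columns increasing together with boundary_value.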

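Further testing

With some testing, we can see multiple cases in which SQL Server makes query plan decisions that appear to rely on the fact that partition numbers are ordered by value.

Here is the full test script, and below are some of the most relevant queries: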

-- The predicate x BETWEEN 4 AND 7 is converted to a range seek
-- on the partition number, strongly indicating that there is a
-- guarantee that partition numbers are in order by value.
-- Seek Keys[1]:
--      Start: PtnId1001 >= Scalar Operator(...[@1]...),
--      End: PtnId1001 <= Scalar Operator(...[@2]...)
SELECT COUNT(*)
FROM dbo.test_partition_ranges
WHERE x BETWEEN 4 AND 7
GO

-- If we change our predicate to use the partition number directly,
-- we see the same seek predicate.
-- Seek Keys[1]:
--      Start: PtnId1001 >= Scalar Operator(...[@1]...),
--      End: PtnId1001 <= Scalar Operator(...[@2]...)
SELECT COUNT(*)
FROM dbo.test_partition_ranges
WHERE $PARTITION.PF_INT_1to10(x) BETWEEN 4 AND 7
GO

-- If we mix conflicting predicates that use values and partition numbers,
-- SQL Server knows that no partitions are eligible.
-- "Actual Partition Count" is 0 and there are 0 logical reads
SELECT COUNT(*)
FROM dbo.test_partition_ranges
WHERE $PARTITION.PF_INT_1to10(x) > 5
    AND x <= 5
GO

-- All 10 partitions are accessed, and ordering by the partition column
-- does not produce a sort! It appears that SQL Server is once again
-- relying on the partition numbers being in order by value.
SELECT x
FROM dbo.test_partition_ranges
ORDER BY x
GO
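The SPLIT case from the question can be checked the same way. This is a sketch on a throwaway function; PF_SplitDemo is a hypothetical name that does not appear in the original scripts, and no partition scheme is attached to it, so no NEXT USED filegroup is needed.

```sql
-- Hypothetical scratch function with three monthly boundaries.
CREATE PARTITION FUNCTION PF_SplitDemo (date) AS RANGE RIGHT FOR VALUES
    ('2017-01-01', '2017-02-01', '2017-03-01');
GO

-- Partition numbers before the split.
SELECT $PARTITION.PF_SplitDemo(CAST('2016-12-15' AS date)) AS december_partition,
       $PARTITION.PF_SplitDemo(CAST('2017-01-15' AS date)) AS january_partition;
GO

-- Add a boundary that is lower than every existing boundary.
ALTER PARTITION FUNCTION PF_SplitDemo() SPLIT RANGE ('2016-12-01');
GO

-- Partition numbers after the split, for the same two dates.
SELECT $PARTITION.PF_SplitDemo(CAST('2016-12-15' AS date)) AS december_partition,
       $PARTITION.PF_SplitDemo(CAST('2017-01-15' AS date)) AS january_partition;
GO

-- Clean up the scratch function.
DROP PARTITION FUNCTION PF_SplitDemo;
GO
```

If numbering follows boundary value and not creation time, both partition numbers should go up by one after the split, and the December date should still map to a lower number than the January date.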