Does SQL Server use pointers instead of storing duplicate rows?

Dan*_*ell 8 sql-server

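I run the built-in stored procedure sp_spaceused before and after performing an operation in our software, to see which tables have rows inserted and how the size of each table changes.

What I see is that, of all the tables that have rows written to them, only a few show an increase in table size. The others show that rows were added, but the stored procedure reports no change in size.

The only case where this does not hold is the first transaction after truncating all the tables. So it seems to me that SQL Server reports the rows as inserted when storing duplicate data, but must be storing only pointers to the earlier identical rows.

Can anyone confirm this?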

gbn*_*gbn 13

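No, SQL Server does not detect duplicate rows.

SQL Server is filling empty or partially empty pages among the pages that are already allocated to the table.

So if I have a very narrow row (say 2 columns), I can add a few hundred more rows to the same page without increasing the space used.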
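To put a rough number on "a few hundred rows", here is a back-of-the-envelope sketch for a heap with one int column and one char(1) column (the shape of the demo table in this answer). It follows the row-size formula from Microsoft's "Estimate the Size of a Heap" documentation; the variable names are my own, and the result is an estimate, since real heaps do not always fill every page completely.

```sql
-- Estimate only: fixed-length columns, no variable-length data.
DECLARE @fixed_data_bytes int = 4 + 1;                       -- int (4 bytes) + char(1) (1 byte)
DECLARE @num_cols         int = 2;
DECLARE @null_bitmap      int = 2 + (@num_cols + 7) / 8;     -- integer division
DECLARE @row_size         int = @fixed_data_bytes + @null_bitmap + 4;  -- 4-byte row header
DECLARE @rows_per_page    int = 8096 / (@row_size + 2);      -- 8096 usable bytes, 2-byte slot entry per row

SELECT @row_size                         AS EstRowSizeBytes,
       @rows_per_page                    AS EstRowsPerPage,
       CEILING(2000.0 / @rows_per_page)  AS EstPagesFor2000Rows;
```

Under these assumptions the estimate works out to 12 bytes per row and roughly 578 rows per 8 KB page, which is why 101 rows fit on one page while 2000 rows need several.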

Quick and dirty demo (no duplicate rows, but you can play with that if you like):

IF OBJECT_ID('dbo.Demo') IS NOT NULL
    DROP TABLE dbo.Demo;
GO
CREATE TABLE dbo.Demo (DemoID int NOT NULL IDENTITY(1,1), Demo char(1) NOT NULL)
GO
SELECT 'zero rows, zero space', SUM(ps.reserved_page_count)/128.0 AS ReservedMB, SUM(ps.used_page_count)/128.0 AS UsedMB
FROM sys.dm_db_partition_stats ps
WHERE ps.object_id = OBJECT_ID('dbo.Demo')
GO

INSERT dbo.Demo VALUES ('a');
GO
SELECT 'one row. Peanuts', SUM(ps.reserved_page_count)/128.0 AS ReservedMB, SUM(ps.used_page_count)/128.0 AS UsedMB
FROM sys.dm_db_partition_stats ps
WHERE ps.object_id = OBJECT_ID('dbo.Demo')
GO

INSERT dbo.Demo VALUES ('b');
GO 100
SELECT '101 rows. All on one page', SUM(ps.reserved_page_count)/128.0 AS ReservedMB, SUM(ps.used_page_count)/128.0 AS UsedMB
FROM sys.dm_db_partition_stats ps
WHERE ps.object_id = OBJECT_ID('dbo.Demo')
GO

INSERT dbo.Demo VALUES ('b');
GO 1899
SELECT '2000 rows. More than one page', SUM(ps.reserved_page_count)/128.0 AS ReservedMB, SUM(ps.used_page_count)/128.0 AS UsedMB
FROM sys.dm_db_partition_stats ps
WHERE ps.object_id = OBJECT_ID('dbo.Demo')
GO

TRUNCATE TABLE dbo.Demo
GO
SELECT 'zero rows, zero space. TRUNCATE deallocates pages', SUM(ps.reserved_page_count)/128.0 AS ReservedMB, SUM(ps.used_page_count)/128.0 AS UsedMB
FROM sys.dm_db_partition_stats ps
WHERE ps.object_id = OBJECT_ID('dbo.Demo')
GO

INSERT dbo.Demo VALUES ('c');
GO 500
SELECT '500 rows. Some space used', SUM(ps.reserved_page_count)/128.0 AS ReservedMB, SUM(ps.used_page_count)/128.0 AS UsedMB
FROM sys.dm_db_partition_stats ps
WHERE ps.object_id = OBJECT_ID('dbo.Demo')
GO

DELETE dbo.Demo
GO
SELECT 'zero rows after delete. Space still allocated', SUM(ps.reserved_page_count)/128.0 AS ReservedMB, SUM(ps.used_page_count)/128.0 AS UsedMB
FROM sys.dm_db_partition_stats ps
WHERE ps.object_id = OBJECT_ID('dbo.Demo')
GO

IF OBJECT_ID('dbo.Demo') IS NOT NULL
    DROP TABLE dbo.Demo;
GO
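If you want to see the partially filled pages directly rather than infer them from reserved and used space, something like the following may help. It is a sketch, not part of the demo above: the table name dbo.FillDemo is hypothetical, and it relies on sys.dm_db_index_physical_stats being run in DETAILED mode, because the default LIMITED mode returns NULL for record_count and avg_page_space_used_in_percent.

```sql
-- Hypothetical scratch table; the name dbo.FillDemo is an assumption for this sketch.
IF OBJECT_ID('dbo.FillDemo') IS NOT NULL
    DROP TABLE dbo.FillDemo;
GO
CREATE TABLE dbo.FillDemo (DemoID int NOT NULL IDENTITY(1,1), Demo char(1) NOT NULL);
GO
INSERT dbo.FillDemo VALUES ('a');
GO 100

-- What sp_spaceused (the procedure used in the question) reports for the table.
EXEC sp_spaceused N'dbo.FillDemo';

-- How many pages are allocated and how full they are on average.
SELECT ips.page_count,
       ips.record_count,
       ips.avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.FillDemo'), NULL, NULL, 'DETAILED') AS ips;
GO

DROP TABLE dbo.FillDemo;
GO
```

A low avg_page_space_used_in_percent means later inserts can land on the existing page, so the row count rises while the size reported by sp_spaceused stays the same.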


小智 7

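Does SQL Server use pointers instead of storing duplicate rows?

It depends on the SQL Server version and on the data compression option:

  • Starting with SQL Server 2008, there is a compression option at the row or page level.
  • Page-level compression uses several algorithms/techniques. Regarding your question (pointers to duplicate data), page compression (also) uses prefix compression and dictionary compression: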

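Prefix compression: [...] Repeated prefix values in the column are replaced by a reference to the corresponding prefix [...]

Dictionary compression: After prefix compression has been completed, dictionary compression is applied. Dictionary compression searches for repeated values anywhere on the page and stores them in the CI area. Unlike prefix compression, dictionary compression is not restricted to one column. Dictionary compression can replace repeated values that occur anywhere on a page. The following illustration shows the same page after dictionary compression.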

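So, with prefix and dictionary compression (page compression), SQL Server uses pointers to store (partially or fully) duplicated values (not duplicate rows), within the same column or across different columns.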

CREATE DATABASE TestComp;
GO

USE TestComp;
GO

CREATE TABLE Person1 (
    PersonID INT IDENTITY PRIMARY KEY,
    FirstName NVARCHAR(100) NOT NULL,
    LastName NVARCHAR(100) NOT NULL
);
GO

DECLARE 
    @f NVARCHAR(100) = REPLICATE('A',100), 
    @l NVARCHAR(100) = REPLICATE('B',100);

INSERT Person1 (FirstName, LastName)
VALUES (@f, @l);
GO 1000

CREATE TABLE Person2 (
    PersonID INT IDENTITY PRIMARY KEY,
    FirstName NVARCHAR(100) NOT NULL,
    LastName NVARCHAR(100) NOT NULL
);
GO

ALTER TABLE Person2
REBUILD
WITH (DATA_COMPRESSION=PAGE);
GO

DECLARE 
    @f NVARCHAR(100) = REPLICATE('A',100), 
    @l NVARCHAR(100) = REPLICATE('B',100);

INSERT Person2 (FirstName, LastName)
VALUES (@f, @l);
GO 1000

SELECT  f.page_count AS PageCount_Person1_Uncompressed
FROM    sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('Person1'), 1, DEFAULT, DEFAULT) f
SELECT  f.page_count AS PageCount_Person2_Compressed
FROM    sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('Person2'), 1, DEFAULT, DEFAULT) f
GO

Results:

PageCount_Person1_Uncompressed
------------------------------
53

PageCount_Person2_Compressed
----------------------------
2
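As a follow-up check, the sketch below looks at the two tables from the script above. It assumes that script has already been run, that Person1 and Person2 ended up in the dbo schema of TestComp (the default in most setups, but an assumption here), and that your edition supports data compression. sys.partitions shows the compression setting actually in effect, and sp_estimate_data_compression_savings gives an estimate of what PAGE compression would do to the uncompressed table.

```sql
USE TestComp;
GO

-- Which compression setting does each table's clustered index use?
SELECT OBJECT_NAME(p.object_id) AS TableName,
       p.index_id,
       p.data_compression_desc,
       p.rows
FROM sys.partitions AS p
WHERE p.object_id IN (OBJECT_ID('dbo.Person1'), OBJECT_ID('dbo.Person2'));
GO

-- Estimated effect of PAGE compression on the uncompressed table.
-- NULL for index and partition means "all indexes, all partitions".
EXEC sp_estimate_data_compression_savings
     @schema_name      = N'dbo',
     @object_name      = N'Person1',
     @index_id         = NULL,
     @partition_number = NULL,
     @data_compression = N'PAGE';
GO
```

The estimate is based on a sample of the data, so it will not necessarily match the page counts measured above exactly.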