Joh*_*ret 4 sql-server storage sql-server-2008-r2 database-internals
I am working with a table of the following form:
CREATE TABLE dbo.ID_STORE
( WORKING_ID bigint PRIMARY KEY CLUSTERED )
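The table stores about 2 million rows, but the stored ids are not contiguous: MAX(WORKING_ID) - MIN(WORKING_ID) is about 24 million.
When I look at the space used, I see about 57 MB, whereas I expected slightly more than 2 × 10^6 × 8 bytes = 16 MB. Can anyone explain the difference?
Edit: these figures were obtained after the first import into the table. The table is also truncated before it is populated.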
Pau*_*ite 14
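When stored in the FixedVar format (the default), each row has at least 7 bytes of overhead. There will also be a (usually relatively small) number of pages used for the upper levels of the clustered index. Optimal storage, ignoring the upper index levels, would require just this much for 2 million rows: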
(7 + 8 bytes) * 2,000,000 = 28.61MB.
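More importantly, pages may split (unless the data is loaded in clustered key order), so the pages are unlikely to be 100% full. When a page splits to accommodate a new row in key order, roughly 50% of the existing rows move to a new page, which lowers the average density. In addition, deleted rows free up space for reuse only when every row on the page has been deleted and the whole page becomes empty. Finally, each 8KB data page has a 96-byte header, and each row on the page uses 2 bytes in the row offset array.
The following example loads 2,000,000 rows with roughly the same distribution as your data, packed as compactly as possible: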
CREATE TABLE dbo.ID_STORE
(
WORKING_ID bigint NOT NULL PRIMARY KEY CLUSTERED
);
WITH
L0 AS(SELECT 1 AS c UNION ALL SELECT 1),
L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),
Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5)
INSERT dbo.ID_STORE WITH (TABLOCKX)
SELECT Nums.n * 12
FROM Nums
WHERE Nums.n <= 2000000;
The output of:
EXECUTE sys.sp_spaceused
@objname = N'dbo.ID_STORE',
@updateusage = 'true';
...shows 33,800 KB of space reserved for the object.
Truncating and reloading so that page splits occur:
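The 33,800 KB figure is close to what the per-row and per-page overheads mentioned earlier predict. As a rough check, here is a minimal Python sketch of that arithmetic. It assumes every leaf page is completely full and ignores upper index levels and allocation pages; the function and constant names are made up for illustration.

```python
import math

PAGE_BYTES = 8192         # one 8 KB data page
PAGE_HEADER_BYTES = 96    # fixed page header
ROW_OVERHEAD_BYTES = 7    # minimum FixedVar row overhead
KEY_BYTES = 8             # one bigint column
SLOT_BYTES = 2            # row offset array entry for each row


def rows_per_full_page() -> int:
    """Rows that fit on one page when it is packed completely full."""
    usable = PAGE_BYTES - PAGE_HEADER_BYTES
    per_row = ROW_OVERHEAD_BYTES + KEY_BYTES + SLOT_BYTES
    return usable // per_row


def leaf_pages_needed(row_count: int) -> int:
    """Leaf-level pages needed at 100% density (upper levels ignored)."""
    return math.ceil(row_count / rows_per_full_page())


pages = leaf_pages_needed(2_000_000)
print(rows_per_full_page())   # 476 rows per page
print(pages)                  # 4202 leaf pages
print(pages * 8)              # 33616 KB for the leaf level alone
```

The remaining difference from the reported 33,800 KB is plausibly the upper index levels plus allocation overhead, although this sketch does not model either.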
TRUNCATE TABLE dbo.ID_STORE;
WITH
L0 AS(SELECT 1 AS c UNION ALL SELECT 1),
L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),
Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5)
INSERT dbo.ID_STORE WITH (TABLOCKX)
SELECT Nums.n * 12
FROM Nums
WHERE Nums.n <= 1000000;
WITH
L0 AS(SELECT 1 AS c UNION ALL SELECT 1),
L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),
Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5)
INSERT dbo.ID_STORE WITH (TABLOCKX)
SELECT Nums.n * 12 + 1
FROM Nums
WHERE Nums.n <= 1000000;
The output of:
EXECUTE sys.sp_spaceused
@objname = N'dbo.ID_STORE',
@updateusage = 'true';
...now shows 50,632 KB of space reserved.
Rebuilding the clustered index:
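Comparing this figure with the 33,800 KB of the compact load gives a rough idea of how full the pages are after the splits. The sketch below is only an estimate: it assumes the compact load represents pages that are essentially 100% full, and the function name is made up for illustration.

```python
def estimated_page_fullness(compact_kb: float, observed_kb: float) -> float:
    """Approximate average page fullness, taking the compact size as 100% full."""
    return compact_kb / observed_kb


fullness = estimated_page_fullness(33_800, 50_632)
wasted_kb = 50_632 - 33_800

print(f"average page fullness: about {fullness:.0%}")
print(f"extra space caused by splits: {wasted_kb} KB")
```

By this estimate the pages are roughly two-thirds full after the second load, which fits the description of splits moving about half of a page's rows to a new page.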
ALTER INDEX ALL
ON dbo.ID_STORE
REBUILD
WITH
(
MAXDOP = 1,
SORT_IN_TEMPDB = ON
);
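...reduces the reserved space to 33,800 KB again.
Depending on the version and edition of SQL Server you have, this table could be stored more compactly using row or page compression, clustered columnstore, or clustered columnstore archive storage. In the last case (which requires SQL Server 2014 Enterprise), the 2,000,000 rows reserve only 2,960 KB of space.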