Storage size of a bigint table

Joh*_*ret 4 sql-server storage sql-server-2008-r2 database-internals

I am working with a table that has the following format:

CREATE TABLE dbo.ID_STORE
( WORKING_ID bigint PRIMARY KEY CLUSTERED )

The table stores approximately 2 million rows, but the stored ids are not contiguous: MAX(WORKING_ID) - MIN(WORKING_ID) is about 24 million.

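When I look at the space used, I find about 57 megabytes, whereas I expected slightly more than 2 x 10^6 x 8 bytes = 16 megabytes. Can anyone explain the difference?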

Edit: These numbers were obtained from a first import into the table. The table is also truncated before being populated.

Pau*_*ite 14

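When stored in the FixedVar format (the default), each row has at least 7 bytes of overhead. There will also be a (usually relatively small) number of pages used for the upper levels of the clustered index. Optimal storage for 2 million rows, disregarding the upper index levels, requires only: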

(7 + 8 bytes) * 2,000,000 = 28.61MB.

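More importantly, pages may have split (unless the data was loaded in clustering key order), so the current pages may not be 100% full. When a page splits to accommodate a new row in key order, around 50% of the existing rows are moved to a new page, which reduces the average page density. Also, deleted rows only result in space being reclaimed if the whole page of rows becomes empty. In addition, each 8KB data page has a 96-byte header, plus 2 bytes per row on the page for the row offset array.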
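As a rough cross-check of the per-row and per-page overheads described above, the following is a minimal back-of-envelope sketch in T-SQL. The variable names are made up for illustration, and the estimate assumes completely full leaf pages while ignoring upper index levels and allocation overhead.

```sql
-- Back-of-envelope estimate of leaf-level pages for 2,000,000 bigint rows.
-- All figures come from the text above; variable names are illustrative only.
DECLARE @rows         bigint = 2000000;
DECLARE @row_bytes    int    = 7 + 8;   -- FixedVar row overhead + bigint column
DECLARE @slot_bytes   int    = 2;       -- row offset array entry per row
DECLARE @page_bytes   int    = 8192;    -- 8KB page
DECLARE @header_bytes int    = 96;      -- page header

-- Integer division rounds down: only whole rows fit on a page.
DECLARE @rows_per_page int =
    (@page_bytes - @header_bytes) / (@row_bytes + @slot_bytes);

-- Round up: a partially filled last page still occupies a full page.
DECLARE @leaf_pages bigint =
    (@rows + @rows_per_page - 1) / @rows_per_page;

SELECT
    rows_per_page = @rows_per_page,     -- 476
    leaf_pages    = @leaf_pages,        -- 4202
    leaf_kb       = @leaf_pages * 8;    -- 33616
```

Under these assumptions the leaf level alone comes to roughly 33,616 KB. Upper index levels and allocation overhead would add a little more.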

The following example loads 2,000,000 rows with roughly the same distribution as your data, packed as densely as possible:

CREATE TABLE dbo.ID_STORE
( 
    WORKING_ID bigint NOT NULL PRIMARY KEY CLUSTERED
);

WITH
  L0   AS(SELECT 1 AS c UNION ALL SELECT 1),
  L1   AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
  L2   AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
  L3   AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
  L4   AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
  L5   AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),
  Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5)
INSERT dbo.ID_STORE WITH (TABLOCKX)
SELECT Nums.n * 12
FROM Nums
WHERE Nums.n <= 2000000;

Output:

EXECUTE sys.sp_spaceused 
    @objname = N'dbo.ID_STORE',
    @updateusage = 'true';

...shows 33,800 KB of space reserved for the object.

Truncating and reloading in two passes, so that page splits occur:

TRUNCATE TABLE dbo.ID_STORE;

WITH
  L0   AS(SELECT 1 AS c UNION ALL SELECT 1),
  L1   AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
  L2   AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
  L3   AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
  L4   AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
  L5   AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),
  Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5)
INSERT dbo.ID_STORE WITH (TABLOCKX)
SELECT Nums.n * 12
FROM Nums
WHERE Nums.n <= 1000000;

WITH
  L0   AS(SELECT 1 AS c UNION ALL SELECT 1),
  L1   AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
  L2   AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
  L3   AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
  L4   AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
  L5   AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),
  Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5)
INSERT dbo.ID_STORE WITH (TABLOCKX)
SELECT Nums.n * 12 + 1
FROM Nums
WHERE Nums.n <= 1000000;

Output:

EXECUTE sys.sp_spaceused 
    @objname = N'dbo.ID_STORE',
    @updateusage = 'true';

...now shows 50,632 KB of space reserved.
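To see how full the pages actually are after the splits, one option is the `sys.dm_db_index_physical_stats` dynamic management function. This is a sketch rather than part of the original demonstration; it assumes it is run in the database that contains `dbo.ID_STORE`, and the `DETAILED` mode reads every page, which can be slow on large tables.

```sql
-- Inspect page counts and page fullness per index level.
-- Assumes the current database contains dbo.ID_STORE.
SELECT
    ips.index_level,                      -- 0 = leaf level
    ips.page_count,
    ips.record_count,
    ips.avg_record_size_in_bytes,
    ips.avg_page_space_used_in_percent    -- lower values indicate emptier pages
FROM sys.dm_db_index_physical_stats
(
    DB_ID(),
    OBJECT_ID(N'dbo.ID_STORE'),
    NULL,         -- all indexes on the table
    NULL,         -- all partitions
    'DETAILED'    -- scans every page and reports all index levels
) AS ips
ORDER BY
    ips.index_level;
```

A leaf-level `avg_page_space_used_in_percent` well below 100 would be consistent with the page-split explanation.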

Rebuilding the clustered index:

ALTER INDEX ALL 
ON dbo.ID_STORE
REBUILD 
WITH 
(
    MAXDOP = 1, 
    SORT_IN_TEMPDB = ON
);

...reduces the reserved space to 33,800 KB again.

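Depending on the version and edition of SQL Server you have, this table could be stored more compactly using row or page compression, clustered columnstore, or clustered columnstore archival storage. In the last case (which requires SQL Server 2014 Enterprise), the 2,000,000 rows reserve only 2,960 KB of space.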
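As an illustration of the row/page compression option only, the sketch below first estimates the savings and then rebuilds with page compression. It assumes an edition that supports data compression (Enterprise on older versions such as SQL Server 2008 R2); the actual savings depend on the data and are not shown here.

```sql
-- Estimate how much space page compression might save (samples the data;
-- does not change the table).
EXECUTE sys.sp_estimate_data_compression_savings
    @schema_name      = N'dbo',
    @object_name      = N'ID_STORE',
    @index_id         = NULL,     -- all indexes
    @partition_number = NULL,     -- all partitions
    @data_compression = N'PAGE';

-- Rebuild the clustered index with page compression.
-- Assumes the edition in use supports DATA_COMPRESSION.
ALTER INDEX ALL
ON dbo.ID_STORE
REBUILD
WITH
(
    DATA_COMPRESSION = PAGE
);
```

Running `sys.sp_spaceused` again afterwards shows the new reserved size for comparison.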