Columnstore index on top of a clustered index (created when declaring the primary key) - SQL Server

Question

Sil*_*SCP 5 sql-server clustered-index sql-server-2012 columnstore sql-server-2014

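When a clustered index is created, the table itself becomes an index structure ordered by the index key.

Now imagine a table XPTO (a, b, c, d, e) with five columns, where "a" is the primary key, and we create a columnstore index on XPTO that includes columns (b, c).

What is the structure of this index? Is it structured differently from the clustered table? Or does the columnstore hold pointers into the clustered table, as other nonclustered indexes do?

Finally, in the same scenario but with the columnstore index created on all of the attributes, what is the structure?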

Answer 1

Geo*_*son 4

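We can learn a lot by writing a test script! You should be able to run the SQL below on any SQL Server 2014 instance (and possibly SQL Server 2012 as well, but I have not tested it there).

From this script, we can see that a nonclustered columnstore index on (b, c) does store data for the clustered index column (a), and that it stores this data the same way it stores b and c (in segments). This allows the columnstore to efficiently process queries that access a, b, and c. Technically, it even allows the columnstore to be combined with a key lookup to process queries that need every column in the table (although the optimizer is unlikely to favor this, given how it costs key lookups).

As for your question about the "structure" of a columnstore index, I think that is too deep a question to answer here. If you are interested in learning more, Niko Neugebauer's excellent 55-part (and growing) columnstore series contains a great deal of valuable information on structure and internals: http://www.nikoport.com/columnstore/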

-- Create a table to your specs
CREATE TABLE dbo.XPTO (a BIGINT NOT NULL, b INT NOT NULL, c INT NOT NULL, d INT NOT NULL, e INT NOT NULL, CONSTRAINT PK_XPTO PRIMARY KEY (a))
GO

-- Insert some trivial dummy data
INSERT INTO dbo.XPTO (a, b, c, d, e)
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)), v2.x, v3.x, v4.x, v5.x
FROM ( VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10) ) v(x)
CROSS JOIN ( VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10) ) v2(x)
CROSS JOIN ( VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10) ) v3(x)
CROSS JOIN ( VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10) ) v4(x)
CROSS JOIN ( VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10) ) v5(x)
CROSS JOIN ( VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10) ) v6(x)
GO

-- Create the nonclustered columnstore
CREATE NONCLUSTERED COLUMNSTORE INDEX cs_XPTO ON dbo.XPTO (b, c)
GO

-- Turn on "Include Actual Execution Plan"

-- Access only the columns specified in the columnstore index
-- Here we get a Columnstore Index Scan, as expected
SELECT SUM(b), SUM(c)
FROM XPTO
GO

-- Add in the clustered index column
-- Now we also get a Columnstore Index Scan!
-- This shows that the non-clustered columnstore does maintain data for "a"
-- because it is able to process this query without touching the clustered index
SELECT SUM(a), SUM(b), SUM(c)
FROM XPTO
GO

-- Access all columns
-- Now we get a Clustered Index Scan, and the columnstore index is not used
SELECT SUM(a), SUM(b), SUM(c), SUM(d), SUM(e)
FROM XPTO
GO

-- Try to force usage of columnstore
-- Now we see a plan that uses the columnstore index but performs a key lookup into the clustered index
-- This is a plan shape I have never seen the query optimizer generate on its own, but it makes sense
-- that it is possible given that the columnstore index stores data for the clustered index column,
-- which is essentially a pointer to the corresponding row in the table
SELECT SUM(a), SUM(b), SUM(c), SUM(d), SUM(e)
FROM XPTO WITH(INDEX(cs_XPTO))
GO

-- View the column store segments
-- Here we see that there is a column store segment for columns a, b, and c (even though we asked for just b and c!)
-- This seems to indicate that clustered index columns are implicitly added to any nonclustered columnstore index
-- and are stored within that columnstore index in the same way as the requested columns
SELECT cs.*
FROM sys.partitions p
JOIN sys.column_store_segments cs
    ON cs.partition_id = p.partition_id
WHERE p.object_id = OBJECT_ID('XPTO')
GO
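
-- Hedged sketch for the last part of the question (columnstore on all columns).
-- This is an illustrative extension, not part of the original test run, and no results are claimed.
-- Assumption: SQL Server 2012/2014 allow only one nonclustered columnstore index per table,
-- so cs_XPTO is dropped first. cs_XPTO_all is a hypothetical index name chosen for illustration.
DROP INDEX cs_XPTO ON dbo.XPTO
GO

CREATE NONCLUSTERED COLUMNSTORE INDEX cs_XPTO_all ON dbo.XPTO (a, b, c, d, e)
GO

-- Re-run the all-columns query and check the actual execution plan to see which index is chosen
SELECT SUM(a), SUM(b), SUM(c), SUM(d), SUM(e)
FROM dbo.XPTO
GO

-- List the segments again to see which columns the new index stores
SELECT cs.column_id, cs.segment_id, cs.row_count, cs.on_disk_size
FROM sys.partitions p
JOIN sys.column_store_segments cs
    ON cs.partition_id = p.partition_id
WHERE p.object_id = OBJECT_ID('dbo.XPTO')
GO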

-- Cleanup
DROP TABLE dbo.XPTO
GO
