Som*_*123 9 performance index xml sql-server execution-plan query-performance
在以下情况下,我没有得到索引查找。而不是<Number>xxx</Number>像这篇文章那样插入到 xml 列中为什么当 where 子句过滤 `value()` 时不使用二级选择性索引?, 用这个 xml 插入 100k 行<SomeText>NiceText</SomeText>和类似数量的 this <SomeText>MoreText</SomeText>。不需要是100k。只需要很多。然后添加索引
create selective xml index SIX_T on dbo.T(XMLDoc) for
(
pathXQUERY = '/SomeText' as xquery 'xs:string' maxlength(8) singleton
);
Run Code Online (Sandbox Code Playgroud)
和二级索引
create xml index SIX_T_pathXQUERY on dbo.T(XMLDoc)
using xml index SIX_T for (pathXQUERY);
Run Code Online (Sandbox Code Playgroud)
然后数数
select count(*)
from dbo.T as T
where T.XMLDoc.exist('/SomeText[. eq "MoreText"]') = 1;
Run Code Online (Sandbox Code Playgroud)
请注意,它不使用索引查找并且“慢”。数百万行可能需要几秒钟。如果我在标准列中插入相同的值并向其添加索引并执行
select count(id)
from dbo.T as T where SomeTextColumn = 'MoreText'
Run Code Online (Sandbox Code Playgroud)
我立即得到结果。在 sql server 18.3.1 上完成的所有测试
问题是,我怎样才能让 xml 计数和按列计数一样快?
谢谢
数据差异
由于要查找的记录数量很少,优化器可以使用SIX_T_pathXQUERY索引:
并moretext使用搜索谓词过滤:
这里有趣的部分是它执行键查找以获取不为空的 path_1_id 值。
由于这是对非聚集 xml 索引的过滤器定义...
...虽然不存在于索引本身中。
因此,对于考虑使用索引的优化器,它知道必须完成以下步骤:
SIX_T内表上的聚集索引相匹配并过滤,path_1_id is not null因为path_1_id不包含在二级索引中dbo.T以返回要依赖的 ID引爆点
当要返回的预期行数更高时,它倾向于使用选择性聚集 XML 索引,以便能够运行合并连接和无键查找:
带有残差谓词:
过滤 xml 列和 path_1_id
比较计划
您可以(但不应该)使用USE PLAN提示来强制执行 XML 列上的查找的计划,看看如果我们在哪里搜索这些值会发生什么。
随着执行时间 =
CPU time = 218 ms, elapsed time = 215 ms.
Run Code Online (Sandbox Code Playgroud)
以及扫描+合并连接计划的执行时间:
CPU time = 62 ms, elapsed time = 58 ms.
Run Code Online (Sandbox Code Playgroud)
简而言之,我相信使用残差谓词扫描 + 合并连接选择是优化器的正确选择。
无路可走
虽然我可能是错的,但我认为没有办法使用常规 XML 索引改进计数查询。我们也无法更改这些内部表,甚至无法查询它们:
SELECT * FROM
[sys].[xml_sxi_table_1463676262_256000];
Run Code Online (Sandbox Code Playgroud)
它甚至在合并连接查询计划中提供了一个缺失的索引提示,您无法创建它:
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [sys].[xml_sxi_table_1463676262_256000] ([pathXQUERY_1_value])
INCLUDE ([path_1_id])
Run Code Online (Sandbox Code Playgroud)
要改进查询,您必须研究非 xml 索引解决方案。
编辑
如何使用 USE PLAN 提示?您有此查询的示例代码吗?
我会建议不要这样做,但出于测试目的,它会很好。
第 1 步:您必须获取查询的实际执行计划并获取 xml:
如果您没有估计值较低的查询,请输入一个不存在的值,例如:
select count(*)
from dbo.T as T
where T.XMLDoc.exist('/SomeText[. eq "bbb"]') = 1
Run Code Online (Sandbox Code Playgroud)
步骤2:将执行计划xml中的所有's替换为''s,我们稍后会用到它。
第 3 步:将计划粘贴到OPTION( USE PLAN '')
SELECT count(*)
FROM dbo.T as T
WHERE T.XMLDoc.exist('/SomeText[. eq "MoreText"]') = 1
OPTION(USE PLAN
'');
Run Code Online (Sandbox Code Playgroud)
第 4 步:我必须将 utf-16 更改为 utf-8 才能使使用计划提示起作用
从:
OPTION(USE PLAN
'<?xml version="1.0" encoding="utf-16"?>
Run Code Online (Sandbox Code Playgroud)
到:
OPTION(USE PLAN
'<?xml version="1.0" encoding="utf-8"?>
Run Code Online (Sandbox Code Playgroud)
第 5 步:运行查询。
我的查询现在看起来像这样:
select count(*)
from dbo.T as T
where T.XMLDoc.exist('/SomeText[. eq "MoreText"]') = 1
OPTION(USE PLAN
'<?xml version="1.0" encoding="utf-8"?>
<ShowPlanXML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Version="1.481" Build="14.0.3223.3" xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
<BatchSequence>
<Batch>
<Statements>
<StmtSimple StatementCompId="1" StatementEstRows="1" StatementId="1" StatementOptmLevel="FULL" StatementOptmEarlyAbortReason="GoodEnoughPlanFound" CardinalityEstimationModelVersion="70" StatementSubTreeCost="0.00986014" StatementText="select count(*)
from dbo.T as T
where T.XMLDoc.exist(''/SomeText[. eq "bbb"]'') = 1" StatementType="SELECT" QueryHash="0x412154B6AD55BBFC" QueryPlanHash="0x280B174BF20902E3" RetrievedFromCache="true" SecurityPolicyApplied="false">
<StatementSetOptions ANSI_NULLS="true" ANSI_PADDING="true" ANSI_WARNINGS="true" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="true" NUMERIC_ROUNDABORT="false" QUOTED_IDENTIFIER="true" />
<QueryPlan DegreeOfParallelism="1" CachedPlanSize="40" CompileTime="394" CompileCPU="301" CompileMemory="648">
<MemoryGrantInfo SerialRequiredMemory="0" SerialDesiredMemory="0" />
<OptimizerHardwareDependentProperties EstimatedAvailableMemoryGrant="419404" EstimatedPagesCached="52425" EstimatedAvailableDegreeOfParallelism="2" MaxCompileMemory="3655432" />
<QueryTimeStats CpuTime="0" ElapsedTime="0" />
<RelOp AvgRowSize="11" EstimateCPU="0" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" LogicalOp="Compute Scalar" NodeId="0" Parallel="false" PhysicalOp="Compute Scalar" EstimatedTotalSubtreeCost="0.00986014">
<OutputList>
<ColumnReference Column="Expr1017" />
</OutputList>
<ComputeScalar>
<DefinedValues>
<DefinedValue>
<ColumnReference Column="Expr1017" />
<ScalarOperator ScalarString="CONVERT_IMPLICIT(int,[Expr1020],0)">
<Convert DataType="int" Style="0" Implicit="true">
<ScalarOperator>
<Identifier>
<ColumnReference Column="Expr1020" />
</Identifier>
</ScalarOperator>
</Convert>
</ScalarOperator>
</DefinedValue>
</DefinedValues>
<RelOp AvgRowSize="11" EstimateCPU="1.1E-06" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" LogicalOp="Aggregate" NodeId="1" Parallel="false" PhysicalOp="Stream Aggregate" EstimatedTotalSubtreeCost="0.00986014">
<OutputList>
<ColumnReference Column="Expr1020" />
</OutputList>
<RunTimeInformation>
<RunTimeCountersPerThread Thread="0" ActualRows="1" Batches="0" ActualEndOfScans="1" ActualExecutions="1" ActualExecutionMode="Row" ActualElapsedms="0" ActualCPUms="0" />
</RunTimeInformation>
<StreamAggregate>
<DefinedValues>
<DefinedValue>
<ColumnReference Column="Expr1020" />
<ScalarOperator ScalarString="Count(*)">
<Aggregate AggType="countstar" Distinct="false" />
</ScalarOperator>
</DefinedValue>
</DefinedValues>
<RelOp AvgRowSize="9" EstimateCPU="1E-06" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" LogicalOp="Aggregate" NodeId="3" Parallel="false" PhysicalOp="Stream Aggregate" EstimatedTotalSubtreeCost="0.00985904">
<OutputList />
<RunTimeInformation>
<RunTimeCountersPerThread Thread="0" ActualRows="0" Batches="0" ActualEndOfScans="1" ActualExecutions="1" ActualExecutionMode="Row" ActualElapsedms="0" ActualCPUms="0" />
</RunTimeInformation>
<StreamAggregate>
<DefinedValues />
<GroupBy>
<ColumnReference Database="[adventureworks]" Schema="[dbo]" Table="[T]" Alias="[T]" Column="ID" />
</GroupBy>
<RelOp AvgRowSize="11" EstimateCPU="4.18E-06" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" LogicalOp="Inner Join" NodeId="4" Parallel="false" PhysicalOp="Nested Loops" EstimatedTotalSubtreeCost="0.00985804">
<OutputList>
<ColumnReference Database="[adventureworks]" Schema="[dbo]" Table="[T]" Alias="[T]" Column="ID" />
</OutputList>
<RunTimeInformation>
<RunTimeCountersPerThread Thread="0" ActualRows="0" Batches="0" ActualEndOfScans="1" ActualExecutions="1" ActualExecutionMode="Row" ActualElapsedms="0" ActualCPUms="0" />
</RunTimeInformation>
<NestedLoops Optimized="false">
<OuterReferences>
<ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pk1" />
</OuterReferences>
<RelOp AvgRowSize="16" EstimateCPU="4.18E-06" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" LogicalOp="Inner Join" NodeId="5" Parallel="false" PhysicalOp="Nested Loops" EstimatedTotalSubtreeCost="0.00657038">
<OutputList>
<ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pk1" />
</OutputList>
<RunTimeInformation>
<RunTimeCountersPerThread Thread="0" ActualRows="0" Batches="0" ActualEndOfScans="1" ActualExecutions="1" ActualExecutionMode="Row" ActualElapsedms="0" ActualCPUms="0" />
</RunTimeInformation>
<NestedLoops Optimized="false">
<OuterReferences>
<ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pk1" />
<ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="row_id" />
</OuterReferences>
<RelOp AvgRowSize="15" EstimateCPU="0.0001581" EstimateIO="0.003125" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" EstimatedRowsRead="1" LogicalOp="Index Seek" NodeId="6" Parallel="false" PhysicalOp="Index Seek" EstimatedTotalSubtreeCost="0.0032831" TableCardinality="150002">
<OutputList>
<ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pk1" />
<ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="row_id" />
</OutputList>
<RunTimeInformation>
<RunTimeCountersPerThread Thread="0" ActualRows="0" Batches="0" ActualEndOfScans="1" ActualExecutions="1" ActualExecutionMode="Row" ActualElapsedms="0" ActualCPUms="0" ActualScans="1" ActualLogicalReads="3" ActualPhysicalReads="0" ActualReadAheads="0" ActualLobLogicalReads="0" ActualLobPhysicalReads="0" ActualLobReadAheads="0" />
</RunTimeInformation>
<IndexScan Ordered="true" ScanDirection="FORWARD" ForcedIndex="false" ForceSeek="false" ForceScan="false" NoExpandHint="false" Storage="RowStore">
<DefinedValues>
<DefinedValue>
<ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pk1" />
</DefinedValue>
<DefinedValue>
<ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="row_id" />
</DefinedValue>
</DefinedValues>
<Object Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Index="[SIX_T_pathXQUERY]" Filtered="true" Alias="[SomeText:1]" IndexKind="SecondarySelectiveXML" Storage="RowStore" />
<SeekPredicates>
<SeekPredicateNew>
<SeekKeys>
<Prefix ScanType="EQ">
<RangeColumns>
<ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pathXQUERY_1_value" />
</RangeColumns>
<RangeExpressions>
<ScalarOperator ScalarString="N''bbb''">
<Const ConstValue="N''bbb''" />
</ScalarOperator>
</RangeExpressions>
</Prefix>
</SeekKeys>
</SeekPredicateNew>
</SeekPredicates>
</IndexScan>
</RelOp>
<RelOp AvgRowSize="461" EstimateCPU="0.0001581" EstimateIO="0.003125" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" LogicalOp="Clustered Index Seek" NodeId="8" Parallel="false" PhysicalOp="Clustered Index Seek" EstimatedTotalSubtreeCost="0.0032831" TableCardinality="150002">
<OutputList />
<RunTimeInformation>
<RunTimeCountersPerThread Thread="0" ActualRows="0" Batches="0" ActualEndOfScans="0" ActualExecutions="0" ActualExecutionMode="Row" ActualElapsedms="0" ActualCPUms="0" ActualScans="0" ActualLogicalReads="0" ActualPhysicalReads="0" ActualReadAheads="0" ActualLobLogicalReads="0" ActualLobPhysicalReads="0" ActualLobReadAheads="0" />
</RunTimeInformation>
<IndexScan Lookup="true" Ordered="true" ScanDirection="FORWARD" ForcedIndex="false" ForceSeek="false" ForceScan="false" NoExpandHint="false" Storage="RowStore">
<DefinedValues />
<Object Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Index="[SIX_T]" Alias="[SomeText:1]" TableReferenceId="-1" IndexKind="SelectiveXML" Storage="RowStore" />
<SeekPredicates>
<SeekPredicateNew>
<SeekKeys>
<Prefix ScanType="EQ">
<RangeColumns>
<ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pk1" />
<ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="row_id" />
</RangeColumns>
<RangeExpressions>
<ScalarOperator ScalarString="[adventureworks].[sys].[xml_sxi_table_1463676262_256000].[pk1] as [SomeText:1].[pk1]">
<Identifier>
<ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pk1" />
</Identifier>
</ScalarOperator>
<ScalarOperator ScalarString="[adventureworks].[sys].[xml_sxi_table_1463676262_256000].[row_id] as [SomeText:1].[row_id]">
<Identifier>
<ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="row_id" />
</Identifier>
</ScalarOperator>
</RangeExpressions>
</Prefix>
</SeekKeys>
</SeekPredicateNew>
</SeekPredicates>
<Predicate>
<ScalarOperator ScalarString="[adventureworks].[sys].[xml_sxi_table_1463676262_256000].[path_1_id] as [SomeText:1].[path_1_id] IS NOT NULL">
<Logical Operation="IS NOT NULL">
<ScalarOperator>
<Identifier>
<ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="path_1_id" />
</Identifier>
</ScalarOperator>
</Logical>
</ScalarOperator>
</Predicate>
</IndexScan>
</RelOp>
</NestedLoops>
</RelOp>
<
问题是,我怎样才能让 xml 计数和按列计数一样快?
根据您需要使用 xPath 表达式的灵活程度,您可以使用属性提升。
create function dbo.GetSomeText(@X xml) returns varchar(8) with schemabinding as
begin
return @X.value('(/SomeText/text())[1]', 'varchar(8)');
end;
go
alter table dbo.T add SomeTextColumn as dbo.GetSomeText(XMLDoc);
go
create index T_IX_SomeText on dbo.T(SomeTextColumn);
go
select count(T.ID)
from dbo.T as T
where T.SomeTextColumn = 'MoreText';
Run Code Online (Sandbox Code Playgroud)
| 归档时间: |
|
| 查看次数: |
530 次 |
| 最近记录: |