Xml 索引,行数慢

Som*_*123 9 performance index xml sql-server execution-plan query-performance

在以下情况下,我没有得到索引查找。而不是<Number>xxx</Number>像这篇文章那样插入到 xml 列中为什么当 where 子句过滤 `value()` 时不使用二级选择性索引?, 用这个 xml 插入 100k 行<SomeText>NiceText</SomeText>和类似数量的 this <SomeText>MoreText</SomeText>。不需要是100k。只需要很多。然后添加索引

create selective xml index SIX_T on dbo.T(XMLDoc) for
(
    pathXQUERY = '/SomeText' as xquery 'xs:string' maxlength(8) singleton
);
Run Code Online (Sandbox Code Playgroud)

和二级索引

create xml index SIX_T_pathXQUERY on dbo.T(XMLDoc)
  using xml index SIX_T for (pathXQUERY);
Run Code Online (Sandbox Code Playgroud)

然后数数

select count(*)
from dbo.T as T
where T.XMLDoc.exist('/SomeText[. eq "MoreText"]') = 1;
Run Code Online (Sandbox Code Playgroud)

请注意,它不使用索引查找并且“慢”。数百万行可能需要几秒钟。如果我在标准列中插入相同的值并向其添加索引并执行

select count(id)
    from dbo.T as T where SomeTextColumn = 'MoreText'
Run Code Online (Sandbox Code Playgroud)

我立即得到结果。在 sql server 18.3.1 上完成的所有测试

问题是,我怎样才能让 xml 计数和按列计数一样快?

谢谢

Ran*_*gen 7

数据差异

由于要查找的记录数量很少,优化器可以使用SIX_T_pathXQUERY索引:

在此处输入图片说明

moretext使用搜索谓词过滤:

在此处输入图片说明

这里有趣的部分是它执行键查找以获取不为空的 path_1_id 值。

在此处输入图片说明

由于这是对非聚集 xml 索引的过滤器定义...

在此处输入图片说明

...虽然不存在于索引本身中。

因此,对于考虑使用索引的优化器,它知道必须完成以下步骤:

  • 使用内部表上的二级索引过滤 XML 值
  • 将这些返回值与SIX_T内表上的聚集索引相匹配并过滤,path_1_id is not null因为path_1_id不包含在二级索引中
  • 将这些值与实际表匹配,dbo.T以返回要依赖的 ID

引爆点

当要返回的预期行数更高时,它倾向于使用选择性聚集 XML 索引,以便能够运行合并连接和无键查找:

在此处输入图片说明

带有残差谓词:

在此处输入图片说明

过滤 xml 列和 path_1_id

比较计划

您可以(但不应该)使用USE PLAN提示来强制执行 XML 列上的查找的计划,看看如果我们在哪里搜索这些值会发生什么。

在此处输入图片说明

随着执行时间 =

   CPU time = 218 ms,  elapsed time = 215 ms.
Run Code Online (Sandbox Code Playgroud)

以及扫描+合并连接计划的执行时间:

   CPU time = 62 ms,  elapsed time = 58 ms.
Run Code Online (Sandbox Code Playgroud)

简而言之,我相信使用残差谓词扫描 + 合并连接选择是优化器的正确选择。

无路可走

虽然我可能是错的,但我认为没有办法使用常规 XML 索引改进计数查询。我们也无法更改这些内部表,甚至无法查询它们:

SELECT * FROM 
[sys].[xml_sxi_table_1463676262_256000];
Run Code Online (Sandbox Code Playgroud)

它甚至在合并连接查询计划中提供了一个缺失的索引提示,您无法创建它:

CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [sys].[xml_sxi_table_1463676262_256000] ([pathXQUERY_1_value])
INCLUDE ([path_1_id])
Run Code Online (Sandbox Code Playgroud)

要改进查询,您必须研究非 xml 索引解决方案。


编辑

如何使用 USE PLAN 提示?您有此查询的示例代码吗?

我会建议不要这样做,但出于测试目的,它会很好。

第 1 步:您必须获取查询的实际执行计划并获取 xml:

在此处输入图片说明

如果您没有估计值较低的查询,请输入一个不存在的值,例如:

select count(*)
from dbo.T as T
where T.XMLDoc.exist('/SomeText[. eq "bbb"]') = 1
Run Code Online (Sandbox Code Playgroud)

步骤2:将执行计划xml中的所有's替换为''s,我们稍后会用到它。

在此处输入图片说明

第 3 步:将计划粘贴到OPTION( USE PLAN '')

SELECT count(*)
FROM dbo.T as T
WHERE T.XMLDoc.exist('/SomeText[. eq "MoreText"]') = 1
OPTION(USE PLAN 
'');
Run Code Online (Sandbox Code Playgroud)

第 4 步:我必须将 utf-16 更改为 utf-8 才能使使用计划提示起作用

从:

OPTION(USE PLAN
'<?xml version="1.0" encoding="utf-16"?>
Run Code Online (Sandbox Code Playgroud)

到:

OPTION(USE PLAN
'<?xml version="1.0" encoding="utf-8"?>
Run Code Online (Sandbox Code Playgroud)

第 5 步:运行查询。

我的查询现在看起来像这样:

select count(*)
from dbo.T as T
where T.XMLDoc.exist('/SomeText[. eq "MoreText"]') = 1
OPTION(USE PLAN
'<?xml version="1.0" encoding="utf-8"?>
<ShowPlanXML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Version="1.481" Build="14.0.3223.3" xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
  <BatchSequence>
    <Batch>
      <Statements>
        <StmtSimple StatementCompId="1" StatementEstRows="1" StatementId="1" StatementOptmLevel="FULL" StatementOptmEarlyAbortReason="GoodEnoughPlanFound" CardinalityEstimationModelVersion="70" StatementSubTreeCost="0.00986014" StatementText="select count(*)&#xD;&#xA;from dbo.T as T&#xD;&#xA;where T.XMLDoc.exist(''/SomeText[. eq &quot;bbb&quot;]'') = 1" StatementType="SELECT" QueryHash="0x412154B6AD55BBFC" QueryPlanHash="0x280B174BF20902E3" RetrievedFromCache="true" SecurityPolicyApplied="false">
          <StatementSetOptions ANSI_NULLS="true" ANSI_PADDING="true" ANSI_WARNINGS="true" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="true" NUMERIC_ROUNDABORT="false" QUOTED_IDENTIFIER="true" />
          <QueryPlan DegreeOfParallelism="1" CachedPlanSize="40" CompileTime="394" CompileCPU="301" CompileMemory="648">
            <MemoryGrantInfo SerialRequiredMemory="0" SerialDesiredMemory="0" />
            <OptimizerHardwareDependentProperties EstimatedAvailableMemoryGrant="419404" EstimatedPagesCached="52425" EstimatedAvailableDegreeOfParallelism="2" MaxCompileMemory="3655432" />
            <QueryTimeStats CpuTime="0" ElapsedTime="0" />
            <RelOp AvgRowSize="11" EstimateCPU="0" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" LogicalOp="Compute Scalar" NodeId="0" Parallel="false" PhysicalOp="Compute Scalar" EstimatedTotalSubtreeCost="0.00986014">
              <OutputList>
                <ColumnReference Column="Expr1017" />
              </OutputList>
              <ComputeScalar>
                <DefinedValues>
                  <DefinedValue>
                    <ColumnReference Column="Expr1017" />
                    <ScalarOperator ScalarString="CONVERT_IMPLICIT(int,[Expr1020],0)">
                      <Convert DataType="int" Style="0" Implicit="true">
                        <ScalarOperator>
                          <Identifier>
                            <ColumnReference Column="Expr1020" />
                          </Identifier>
                        </ScalarOperator>
                      </Convert>
                    </ScalarOperator>
                  </DefinedValue>
                </DefinedValues>
                <RelOp AvgRowSize="11" EstimateCPU="1.1E-06" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" LogicalOp="Aggregate" NodeId="1" Parallel="false" PhysicalOp="Stream Aggregate" EstimatedTotalSubtreeCost="0.00986014">
                  <OutputList>
                    <ColumnReference Column="Expr1020" />
                  </OutputList>
                  <RunTimeInformation>
                    <RunTimeCountersPerThread Thread="0" ActualRows="1" Batches="0" ActualEndOfScans="1" ActualExecutions="1" ActualExecutionMode="Row" ActualElapsedms="0" ActualCPUms="0" />
                  </RunTimeInformation>
                  <StreamAggregate>
                    <DefinedValues>
                      <DefinedValue>
                        <ColumnReference Column="Expr1020" />
                        <ScalarOperator ScalarString="Count(*)">
                          <Aggregate AggType="countstar" Distinct="false" />
                        </ScalarOperator>
                      </DefinedValue>
                    </DefinedValues>
                    <RelOp AvgRowSize="9" EstimateCPU="1E-06" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" LogicalOp="Aggregate" NodeId="3" Parallel="false" PhysicalOp="Stream Aggregate" EstimatedTotalSubtreeCost="0.00985904">
                      <OutputList />
                      <RunTimeInformation>
                        <RunTimeCountersPerThread Thread="0" ActualRows="0" Batches="0" ActualEndOfScans="1" ActualExecutions="1" ActualExecutionMode="Row" ActualElapsedms="0" ActualCPUms="0" />
                      </RunTimeInformation>
                      <StreamAggregate>
                        <DefinedValues />
                        <GroupBy>
                          <ColumnReference Database="[adventureworks]" Schema="[dbo]" Table="[T]" Alias="[T]" Column="ID" />
                        </GroupBy>
                        <RelOp AvgRowSize="11" EstimateCPU="4.18E-06" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" LogicalOp="Inner Join" NodeId="4" Parallel="false" PhysicalOp="Nested Loops" EstimatedTotalSubtreeCost="0.00985804">
                          <OutputList>
                            <ColumnReference Database="[adventureworks]" Schema="[dbo]" Table="[T]" Alias="[T]" Column="ID" />
                          </OutputList>
                          <RunTimeInformation>
                            <RunTimeCountersPerThread Thread="0" ActualRows="0" Batches="0" ActualEndOfScans="1" ActualExecutions="1" ActualExecutionMode="Row" ActualElapsedms="0" ActualCPUms="0" />
                          </RunTimeInformation>
                          <NestedLoops Optimized="false">
                            <OuterReferences>
                              <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pk1" />
                            </OuterReferences>
                            <RelOp AvgRowSize="16" EstimateCPU="4.18E-06" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" LogicalOp="Inner Join" NodeId="5" Parallel="false" PhysicalOp="Nested Loops" EstimatedTotalSubtreeCost="0.00657038">
                              <OutputList>
                                <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pk1" />
                              </OutputList>
                              <RunTimeInformation>
                                <RunTimeCountersPerThread Thread="0" ActualRows="0" Batches="0" ActualEndOfScans="1" ActualExecutions="1" ActualExecutionMode="Row" ActualElapsedms="0" ActualCPUms="0" />
                              </RunTimeInformation>
                              <NestedLoops Optimized="false">
                                <OuterReferences>
                                  <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pk1" />
                                  <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="row_id" />
                                </OuterReferences>
                                <RelOp AvgRowSize="15" EstimateCPU="0.0001581" EstimateIO="0.003125" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" EstimatedRowsRead="1" LogicalOp="Index Seek" NodeId="6" Parallel="false" PhysicalOp="Index Seek" EstimatedTotalSubtreeCost="0.0032831" TableCardinality="150002">
                                  <OutputList>
                                    <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pk1" />
                                    <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="row_id" />
                                  </OutputList>
                                  <RunTimeInformation>
                                    <RunTimeCountersPerThread Thread="0" ActualRows="0" Batches="0" ActualEndOfScans="1" ActualExecutions="1" ActualExecutionMode="Row" ActualElapsedms="0" ActualCPUms="0" ActualScans="1" ActualLogicalReads="3" ActualPhysicalReads="0" ActualReadAheads="0" ActualLobLogicalReads="0" ActualLobPhysicalReads="0" ActualLobReadAheads="0" />
                                  </RunTimeInformation>
                                  <IndexScan Ordered="true" ScanDirection="FORWARD" ForcedIndex="false" ForceSeek="false" ForceScan="false" NoExpandHint="false" Storage="RowStore">
                                    <DefinedValues>
                                      <DefinedValue>
                                        <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pk1" />
                                      </DefinedValue>
                                      <DefinedValue>
                                        <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="row_id" />
                                      </DefinedValue>
                                    </DefinedValues>
                                    <Object Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Index="[SIX_T_pathXQUERY]" Filtered="true" Alias="[SomeText:1]" IndexKind="SecondarySelectiveXML" Storage="RowStore" />
                                    <SeekPredicates>
                                      <SeekPredicateNew>
                                        <SeekKeys>
                                          <Prefix ScanType="EQ">
                                            <RangeColumns>
                                              <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pathXQUERY_1_value" />
                                            </RangeColumns>
                                            <RangeExpressions>
                                              <ScalarOperator ScalarString="N''bbb''">
                                                <Const ConstValue="N''bbb''" />
                                              </ScalarOperator>
                                            </RangeExpressions>
                                          </Prefix>
                                        </SeekKeys>
                                      </SeekPredicateNew>
                                    </SeekPredicates>
                                  </IndexScan>
                                </RelOp>
                                <RelOp AvgRowSize="461" EstimateCPU="0.0001581" EstimateIO="0.003125" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" LogicalOp="Clustered Index Seek" NodeId="8" Parallel="false" PhysicalOp="Clustered Index Seek" EstimatedTotalSubtreeCost="0.0032831" TableCardinality="150002">
                                  <OutputList />
                                  <RunTimeInformation>
                                    <RunTimeCountersPerThread Thread="0" ActualRows="0" Batches="0" ActualEndOfScans="0" ActualExecutions="0" ActualExecutionMode="Row" ActualElapsedms="0" ActualCPUms="0" ActualScans="0" ActualLogicalReads="0" ActualPhysicalReads="0" ActualReadAheads="0" ActualLobLogicalReads="0" ActualLobPhysicalReads="0" ActualLobReadAheads="0" />
                                  </RunTimeInformation>
                                  <IndexScan Lookup="true" Ordered="true" ScanDirection="FORWARD" ForcedIndex="false" ForceSeek="false" ForceScan="false" NoExpandHint="false" Storage="RowStore">
                                    <DefinedValues />
                                    <Object Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Index="[SIX_T]" Alias="[SomeText:1]" TableReferenceId="-1" IndexKind="SelectiveXML" Storage="RowStore" />
                                    <SeekPredicates>
                                      <SeekPredicateNew>
                                        <SeekKeys>
                                          <Prefix ScanType="EQ">
                                            <RangeColumns>
                                              <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pk1" />
                                              <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="row_id" />
                                            </RangeColumns>
                                            <RangeExpressions>
                                              <ScalarOperator ScalarString="[adventureworks].[sys].[xml_sxi_table_1463676262_256000].[pk1] as [SomeText:1].[pk1]">
                                                <Identifier>
                                                  <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pk1" />
                                                </Identifier>
                                              </ScalarOperator>
                                              <ScalarOperator ScalarString="[adventureworks].[sys].[xml_sxi_table_1463676262_256000].[row_id] as [SomeText:1].[row_id]">
                                                <Identifier>
                                                  <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="row_id" />
                                                </Identifier>
                                              </ScalarOperator>
                                            </RangeExpressions>
                                          </Prefix>
                                        </SeekKeys>
                                      </SeekPredicateNew>
                                    </SeekPredicates>
                                    <Predicate>
                                      <ScalarOperator ScalarString="[adventureworks].[sys].[xml_sxi_table_1463676262_256000].[path_1_id] as [SomeText:1].[path_1_id] IS NOT NULL">
                                        <Logical Operation="IS NOT NULL">
                                          <ScalarOperator>
                                            <Identifier>
                                              <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="path_1_id" />
                                            </Identifier>
                                          </ScalarOperator>
                                        </Logical>
                                      </ScalarOperator>
                                    </Predicate>
                                  </IndexScan>
                                </RelOp>
                              </NestedLoops>
                            </RelOp>
                            <


Mik*_*son 6

问题是,我怎样才能让 xml 计数和按列计数一样快?

根据您需要使用 xPath 表达式的灵活程度,您可以使用属性提升

create function dbo.GetSomeText(@X xml) returns varchar(8) with schemabinding as
begin
  return @X.value('(/SomeText/text())[1]', 'varchar(8)');
end;

go

alter table dbo.T add SomeTextColumn as dbo.GetSomeText(XMLDoc);

go

create index T_IX_SomeText on dbo.T(SomeTextColumn);

go

select count(T.ID)
from dbo.T as T
where T.SomeTextColumn = 'MoreText';
Run Code Online (Sandbox Code Playgroud)

在此处输入图片说明