Why does SQL Server sometimes estimate that joining to an empty table will increase the row count?

Mar*_*ith 6 performance sql-server execution-plan tsqlt

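I recently encountered an issue where a tSQLt test was taking a long time to run.

The procedure under test was doing a 38-table (!) join (with 37 faked tables and a table-valued parameter).

Only two of the faked tables and the TVP had any rows inserted.

Compile time was very slow.

Trace flag 8675 showed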

End of simplification, time: 0.002 net: 0.002 total: 0 net: 0.002
end exploration, tasks: 549 no total cost time: 0.013 net: 0.013 total: 0 net: 0.015
end search(0),  cost: 13372.9 tasks: 3517 time: 0.012 net: 0.012 total: 0 net: 0.028
end exploration, tasks: 3983 Cost = 13372.9 time: 0 net: 0 total: 0 net: 0.028
end search(1),  cost: 6706.79 tasks: 10187 time: 0.024 net: 0.024 total: 0 net: 0.052
end exploration, tasks: 10188 Cost = 6706.79 time: 0 net: 0 total: 0 net: 0.052
end search(1),  cost: 6706.79 tasks: 61768 time: 0.165 net: 0.165 total: 0 net: 0.218
*** Optimizer time out abort at task 614400 ***
end search(2),  cost: 6706.79 tasks: 614400 time: 12.539 net: 12.539 total: 12 net: 12.758
*** Optimizer time out abort at task 614400 ***
End of post optimization rewrite, time: 0.001 net: 0.001 total: 12 net: 12.759
End of query plan compilation, time: 0.003 net: 0.003 total: 12 net: 12.762
SQL Server parse and compile time: 
   CPU time = 12735 ms, elapsed time = 12770 ms.
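
For anyone wanting to produce similar output, here is a minimal sketch. It assumes the T1 and T2 tables from the repro further down already exist, and that you have the sysadmin permission QUERYTRACEON needs. Trace flags 3604 and 8675 are undocumented, so treat the exact output as version dependent.

```sql
-- Reports the "SQL Server parse and compile time" lines.
SET STATISTICS TIME ON;

-- 3604 redirects trace output to the client (Messages tab in SSMS).
-- 8675 prints the optimizer phase/task timings (undocumented flag).
-- RECOMPILE forces a fresh compilation so there is something to report.
SELECT *
FROM T1 LEFT OUTER JOIN T2 ON T1.C1 = T2.C1
OPTION (RECOMPILE, QUERYTRACEON 3604, QUERYTRACEON 8675);

SET STATISTICS TIME OFF;
```

A two-table join like this will not hit the optimizer timeout; it only illustrates where the output comes from.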

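It looks like the estimated number of rows grows exponentially with each join between empty tables, until at the end the estimated row count is 135,601,000 and the query has a huge estimated cost that justifies the longer compile time.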

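Many of these joins involved one specific table, and inserting a single row into it was sufficient to stop the explosive growth from the joins that table was involved in (the cardinality estimator output shows it is now using a statistics histogram from that table).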


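The initial behaviour seems odd to me. SQL Server knows that the tables it is joining to are empty, and the plan caching white paper states that inserting any rows into an empty table will cause the recompilation threshold to be reached, so is there any good reason for this?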

Repro showing the growth in estimated rows (though without the long compile time):


CREATE TABLE T1(C1 INT);

INSERT INTO T1 VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9);

CREATE TABLE T2(C1 INT, C2 VARCHAR(MAX));

SELECT *
FROM T1 LEFT OUTER JOIN T2 ON T1.C1 = T2.C1
        LEFT OUTER JOIN T2 T3 ON T3.C1 = T2.C1
        LEFT OUTER JOIN T2 T4 ON T4.C1 = T2.C1
        LEFT OUTER JOIN T2 T5 ON T5.C1 = T2.C1
        LEFT OUTER JOIN T2 T6 ON T6.C1 = T2.C1
        LEFT OUTER JOIN T2 T7 ON T7.C1 = T2.C1
        LEFT OUTER JOIN T2 T8 ON T8.C1 = T2.C1
        LEFT OUTER JOIN T2 T9 ON T9.C1 = T2.C1
        LEFT OUTER JOIN T2 T10 ON T10.C1 = T2.C1
        LEFT OUTER JOIN T2 T11 ON T11.C1 = T2.C1
        LEFT OUTER JOIN T2 T12 ON T12.C1 = T2.C1
        LEFT OUTER JOIN T2 T13 ON T13.C1 = T2.C1
        LEFT OUTER JOIN T2 T14 ON T14.C1 = T2.C1
        LEFT OUTER JOIN T2 T15 ON T15.C1 = T2.C1
        LEFT OUTER JOIN T2 T16 ON T16.C1 = T2.C1
        LEFT OUTER JOIN T2 T17 ON T17.C1 = T2.C1
        LEFT OUTER JOIN T2 T18 ON T18.C1 = T2.C1
        LEFT OUTER JOIN T2 T19 ON T19.C1 = T2.C1
        LEFT OUTER JOIN T2 T20 ON T20.C1 = T2.C1
        LEFT OUTER JOIN T2 T21 ON T21.C1 = T2.C1
        LEFT OUTER JOIN T2 T22 ON T22.C1 = T2.C1
        LEFT OUTER JOIN T2 T23 ON T23.C1 = T2.C1
        LEFT OUTER JOIN T2 T24 ON T24.C1 = T2.C1
        LEFT OUTER JOIN T2 T25 ON T25.C1 = T2.C1
        LEFT OUTER JOIN T2 T26 ON T26.C1 = T2.C1
        LEFT OUTER JOIN T2 T27 ON T27.C1 = T2.C1
        LEFT OUTER JOIN T2 T28 ON T28.C1 = T2.C1
        LEFT OUTER JOIN T2 T29 ON T29.C1 = T2.C1
        LEFT OUTER JOIN T2 T30 ON T30.C1 = T2.C1
        LEFT OUTER JOIN T2 T31 ON T31.C1 = T2.C1
        LEFT OUTER JOIN T2 T32 ON T32.C1 = T2.C1
        LEFT OUTER JOIN T2 T33 ON T33.C1 = T2.C1
        LEFT OUTER JOIN T2 T34 ON T34.C1 = T2.C1
        LEFT OUTER JOIN T2 T35 ON T35.C1 = T2.C1
        LEFT OUTER JOIN T2 T36 ON T36.C1 = T2.C1
        LEFT OUTER JOIN T2 T37 ON T37.C1 = T2.C1
        LEFT OUTER JOIN T2 T38 ON T38.C1 = T2.C1
        LEFT OUTER JOIN T2 T39 ON T39.C1 = T2.C1

Pau*_*ite 4

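I don't know why the "new" (or default, as Microsoft would have it) cardinality estimator resorts to a 50% selectivity guess when the "combined distinct count" is 1, but it does: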

Begin selectivity computation

Input tree:

  LogOp_LeftOuterJoin
      CStCollOuterJoin(ID=40, CARD=9 x_jtLeftOuter)
          CStCollBaseTable(ID=1, CARD=9 TBL: T1)
          CStCollBaseTable(ID=2, CARD=1 TBL: T2)
      CStCollBaseTable(ID=3, CARD=1 TBL: T2 AS TBL: T3)
      ScaOp_Comp x_cmpEq
          ScaOp_Identifier QCOL: [T3].C1
          ScaOp_Identifier QCOL: [Sandpit].[dbo].[T2].C1

Plan for computation:
  CSelCalcSimpleJoinWithDistinctCounts (Using base cardinality)
      CDVCPlanJoin
          Plan for non-join columns (Right)
              CDVCPlanLeaf
                  0 Multi-Column Stats, 0 Single-Column Stats, 1 Guesses
      CDVCPlanLeaf
          0 Multi-Column Stats, 0 Single-Column Stats, 1 Guesses

Using ambient cardinality 1 to combine distinct counts:
  1

Using ambient cardinality 1 to combine distinct counts:
  1

Selectivity: 0.5
Stats collection generated: 

  CStCollOuterJoin(ID=41, CARD=13.3889 x_jtLeftOuter)
      CStCollOuterJoin(ID=40, CARD=9 x_jtLeftOuter)
          CStCollBaseTable(ID=1, CARD=9 TBL: T1)
          CStCollBaseTable(ID=2, CARD=1 TBL: T2)
      CStCollBaseTable(ID=3, CARD=1 TBL: T2 AS TBL: T3)
End selectivity computation
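
Output like the above can be requested with something along these lines. This is a sketch, assuming the repro's T1 and T2 tables exist, the database is using the new cardinality estimator, and you have sysadmin permission for QUERYTRACEON. Trace flag 2363 is undocumented, so the format may differ between versions.

```sql
-- 3604 sends trace output to the client.
-- 2363 prints the new cardinality estimator's selectivity
-- computations (undocumented flag).
SELECT *
FROM T1 LEFT OUTER JOIN T2 ON T1.C1 = T2.C1
        LEFT OUTER JOIN T2 T3 ON T3.C1 = T2.C1
OPTION (RECOMPILE, QUERYTRACEON 3604, QUERYTRACEON 2363);
```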

The "legacy" (or better, as I would have it) cardinality estimator does not have this issue.
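
One way to try the legacy estimator on a single query is a query hint. The sketch below assumes SQL Server 2016 SP1 or later (where USE HINT is available) and the T1 and T2 tables from the question's repro; the join list is shortened for illustration.

```sql
-- Ask for the legacy cardinality estimator for this statement only.
-- On older versions, OPTION (QUERYTRACEON 9481) is the documented
-- trace flag equivalent (requires sysadmin).
SELECT *
FROM T1 LEFT OUTER JOIN T2 ON T1.C1 = T2.C1
        LEFT OUTER JOIN T2 T3 ON T3.C1 = T2.C1
        LEFT OUTER JOIN T2 T4 ON T4.C1 = T2.C1
OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));
```

Comparing the estimated row counts in the plan with and without the hint should show whether the growth is specific to the new estimator in your build.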