LeM*_*iek 14 performance sql-server sql-server-2016 cardinality-estimates query-performance
I am trying to understand why the row estimate is so badly wrong. Here is my case:
A simple join, using SQL Server 2016 SP2 (same problem on SP1), database compatibility level = 130.
select Amount_TransactionCurrency_id, CurrencyShareds.id
from CurrencyShareds
INNER JOIN annexes ON Amount_TransactionCurrency_id = CurrencyShareds.Id
option (QUERYTRACEON 3604, QUERYTRACEON 2363);
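SQL Server estimates 1 row where the actual count is 107131, and chooses a nested loops join (link to the plan). After updating statistics on CurrencyShareds the estimate is fine and a merge join is chosen (link to the new plan). As soon as a single record is added to CurrencyShareds, the statistics become "stale" and SQL Server goes back to the wrong estimate.
I would not worry too much about this simple query, but it is only one part of a bigger query, and this is where the domino effect starts...
Why does adding one row to a 100-row table cause such damage? Looking at the output of the cardinality estimation trace, I see the warning *** WARNING: badly-formed histogram *** but I cannot find any more information on the subject.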
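A minimal sketch for checking how stale the statistics on CurrencyShareds are, assuming the table lives in the dbo schema (as the object names in the trace output suggest). It reads sys.dm_db_stats_properties, which reports when each statistics object was last updated and how many modifications have happened since:

-- Assumption: the table is dbo.CurrencyShareds in the current database.
SELECT s.name                  AS stats_name,
       sp.last_updated,
       sp.rows                 AS rows_at_last_update,
       sp.steps                AS histogram_steps,
       sp.modification_counter AS changes_since_update
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID(N'dbo.CurrencyShareds');

A non-zero modification_counter next to an old last_updated value would match the "one row added, statistics stale" situation described above.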
Here is the full output of the cardinality estimation:
Begin selectivity computation
Input tree:
LogOp_Join
CStCollBaseTable(ID=1, CARD=107131 TBL: annexes)
CStCollBaseTable(ID=2, CARD=100 TBL: CurrencyShareds)
ScaOp_Comp x_cmpEq
ScaOp_Identifier QCOL: [test.MasterData].[dbo].[CurrencyShareds].Id
ScaOp_Identifier QCOL: [test.MasterData].[dbo].[Annexes].Amount_TransactionCurrency_id
Plan for computation:
CSelCalcExpressionComparedToExpression( QCOL: [test.MasterData].[dbo].[Annexes].Amount_TransactionCurrency_id x_cmpEq QCOL: [test.MasterData].[dbo].[CurrencyShareds].Id )
Loaded histogram for column QCOL: [test.MasterData].[dbo].[Annexes].Amount_TransactionCurrency_id from stats with id 7
Loaded histogram for column QCOL: [test.MasterData].[dbo].[CurrencyShareds].Id from stats with id 1 *** WARNING: badly-formed histogram ***
Selectivity: 4.59503e-018
Stats collection generated:
CStCollJoin(ID=3, CARD=1 x_jtInner)
CStCollBaseTable(ID=1, CARD=107131 TBL: annexes)
CStCollBaseTable(ID=2, CARD=100 TBL: CurrencyShareds)
End selectivity computation
Estimating distinct count in utility function
Input stats collection:
CStCollBaseTable(ID=1, CARD=107131 TBL: annexes)
Columns to distinct on:QCOL: [test.MasterData].[dbo].[Annexes].Amount_TransactionCurrency_id
Plan for computation:
CDVCPlanLeaf
0 Multi-Column Stats, 1 Single-Column Stats, 0 Guesses
Covering multi-col stats id: 7
Using ambient cardinality 107131 to combine distinct counts:
5
Combined distinct count: 5
Result of computation: 5
Estimating distinct count in utility function
Input stats collection:
CStCollBaseTable(ID=2, CARD=100 TBL: CurrencyShareds)
Columns to distinct on:QCOL: [test.MasterData].[dbo].[CurrencyShareds].Id
Plan for computation:
CDVCPlanUniqueKey
Result of computation: 100
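When I update the statistics on CurrencyShareds, the part with the "badly-formed histogram" warning changes (the warning is gone) and the cardinality is computed correctly: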
Plan for computation:
CSelCalcExpressionComparedToExpression( QCOL: [test.MasterData].[dbo].[Annexes].Amount_TransactionCurrency_id x_cmpEq QCOL: [test.MasterData].[dbo].[CurrencyShareds].Id )
Loaded histogram for column QCOL: [test.MasterData].[dbo].[Annexes].Amount_TransactionCurrency_id from stats with id 7
Loaded histogram for column QCOL: [test.MasterData].[dbo].[CurrencyShareds].Id from stats with id 1
Selectivity: 0.01
Stats collection generated:
CStCollJoin(ID=3, CARD=107131 x_jtInner)
CStCollBaseTable(ID=1, CARD=107131 TBL: annexes)
CStCollBaseTable(ID=2, CARD=100 TBL: CurrencyShareds)
End selectivity computation
And here are the statistics for "[CurrencyShareds].Id from stats with id 1", the one with the histogram warning. They look fine to me...
Name                   Updated              Rows  Rows Sampled  Steps  Density  Average key length  String Index  Filter Expression  Unfiltered Rows  Persisted Sample Percent
---------------------  -------------------  ----  ------------  -----  -------  ------------------  ------------  -----------------  ---------------  ------------------------
PK_CurrencyShareds_Id  May 23 2018 10:43PM  98    98            75     1        8                   NO            NULL               98               0
(1 row affected)
All density  Average Length  Columns
-----------  --------------  -------
0,01020408   8               Id
(1 row affected)
RANGE_HI_KEY RANGE_ROWS EQ_ROWS DISTINCT_RANGE_ROWS AVG_RANGE_ROWS
-------------------- ------------- ------------- -------------------- --------------
119762190797406464 0 1 0 1
119762190797406466 1 1 1 1
119762190797406468 1 1 1 1
119762190797406470 1 1 1 1
119762190797406472 1 1 1 1
119762190797406474 1 1 1 1
119762190797406476 1 1 1 1
119762190797406478 1 1 1 1
119762190797406480 1 1 1 1
119762190797406482 1 1 1 1
119762190797406484 1 1 1 1
119762190797406486 1 1 1 1
119762190797406488 1 1 1 1
119762190797406490 1 1 1 1
119762190797406492 1 1 1 1
119762190797406494 1 1 1 1
119762190797406496 1 1 1 1
119762190797406498 1 1 1 1
119762190797406500 1 1 1 1
119762190797406502 1 1 1 1
119762190797406504 1 1 1 1
119762190797406506 1 1 1 1
119762190797406507 0 1 0 1
478531702587687680 0 1 0 1
478531702591881728 0 1 0 1
478531702591881729 0 1 0 1
478531702591881984 0 1 0 1
478531702591881985 0 1 0 1
478531702596076032 0 1 0 1
478531702596076033 0 1 0 1
478531702596076288 0 1 0 1
478531702600270336 0 1 0 1
478531702600270592 0 1 0 1
478532235583062528 0 1 0 1
478532235583062784 0 1 0 1
478532235587256832 0 1 0 1
530792464911467264 0 1 0 1
530792464924049920 0 1 0 1
530792464924050176 0 1 0 1
530792464928244224 0 1 0 1
530792464928244480 0 1 0 1
530792464932438528 0 1 0 1
530792464932438784 0 1 0 1
530792464936632832 0 1 0 1
530792464936632833 0 1 0 1
530792464936633088 0 1 0 1
530792464940827136 0 1 0 1
530792464940827392 0 1 0 1
530792464949216000 2 1 2 1
530792464953410048 0 1 0 1
530792464953410304 0 1 0 1
530792464957604352 0 1 0 1
530792464957604353 0 1 0 1
530792464957604608 0 1 0 1
530792464961798656 0 1 0 1
530792464961798912 0 1 0 1
530792464965992960 0 1 0 1
530792464965993216 0 1 0 1
530792464965993217 0 1 0 1
530792464970187264 0 1 0 1
530792464970187265 0 1 0 1
530792464970187520 0 1 0 1
530792464974381568 0 1 0 1
530792464974381824 0 1 0 1
530792464974381825 0 1 0 1
530792464978575872 0 1 0 1
530792464978575873 0 1 0 1
530792464978576128 0 1 0 1
867420708903354880 0 1 0 1
867420708903355136 0 1 0 1
867420708903355137 0 1 0 1
960876568220042240 0 1 0 1
976385263448130048 0 1 0 1
977302121709864192 0 1 0 1
977955748426318592 0 1 0 1
And the information for the second index:
Name                              Updated             Rows    Rows Sampled  Steps  Density  Average key length  String Index  Filter Expression  Unfiltered Rows  Persisted Sample Percent
--------------------------------  ------------------  ------  ------------  -----  -------  ------------------  ------------  -----------------  ---------------  ------------------------
IX_FK_Amount_TransactionCurrency  May 21 2018 3:29PM  107204  107204        5      0        16                  NO            NULL               107204           0
(1 row affected)
All density  Average Length  Columns
-----------  --------------  ---------------------------------
0,2          8               Amount_TransactionCurrency_id
9,32801E-06  16              Amount_TransactionCurrency_id, Id
(2 rows affected)
RANGE_HI_KEY RANGE_ROWS EQ_ROWS DISTINCT_RANGE_ROWS AVG_RANGE_ROWS
-------------------- ------------- ------------- -------------------- --------------
119762190797406475 0 160 0 1
119762190797406478 0 867 0 1
119762190797406481 0 106 0 1
119762190797406494 0 105742 0 1
119762190797406496 0 329 0 1
Joe*_*ish 10
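Based on your histograms, I was able to reproduce the problem in SQL Server 2017 CU6. I would not say that you are doing anything wrong. Rather, something is going wrong in cardinality estimation. This is what I get before inserting a row:
After inserting a row, the final cardinality estimate drops dramatically:
You have a very simple repro here, so my advice is to submit product feedback or open a support ticket with Microsoft. I found a few workarounds that work for your sample data, one of which may be acceptable for you.
One option concerns the unique index on CurrencyShareds.Id. Without the unique index I could not get the repro to work. The table is tiny, so perhaps you can get away without the index. Of course, you may have good reasons to keep it. Another option is a local change to the join predicate: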
select Amount_TransactionCurrency_id, CurrencyShareds.id
from CurrencyShareds
INNER JOIN annexes
ON Amount_TransactionCurrency_id % 9223372036854775809 = CurrencyShareds.Id % 9223372036854775809
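I suspect this works because the CE appears to use density instead of the histograms. Other similar rewrites may have the same effect. There is no guarantee that this type of query will keep working well in the future. That is why you should contact Microsoft: it improves the chances that a fix for your issue will one day make it into the released product.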
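As a rough illustration of what a density-based estimate could look like for this join: the sketch below assumes the textbook containment formula, selectivity = 1 / MAX(distinct values on either side). That formula is an assumption made for illustration, not a confirmed description of the optimizer's internals. The cardinalities and distinct counts are taken from the trace output in the question.

-- Assumption: containment-style estimate, not confirmed SQL Server internals.
DECLARE @card_annexes      float = 107131; -- CARD for annexes in the trace
DECLARE @card_currency     float = 100;    -- CARD for CurrencyShareds in the trace
DECLARE @distinct_annexes  float = 5;      -- "Combined distinct count: 5"
DECLARE @distinct_currency float = 100;    -- CDVCPlanUniqueKey result

SELECT 1 / IIF(@distinct_annexes > @distinct_currency,
               @distinct_annexes, @distinct_currency) AS join_selectivity,
       @card_annexes * @card_currency
         / IIF(@distinct_annexes > @distinct_currency,
               @distinct_annexes, @distinct_currency) AS estimated_rows;

Under that assumption the selectivity comes out at 0.01 and the estimate at roughly 107131 rows, which agrees with the estimate seen after a statistics update.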
OK, I hope I understand it now. This is our situation:
CSelCalcExpressionComparedToExpression( QCOL: [test.MasterData].[dbo].[Annexes].Amount_TransactionCurrency_id x_cmpEq QCOL: [test.MasterData].[dbo].[CurrencyShareds].Id )
Loaded histogram for column QCOL: [test.MasterData].[dbo].[Annexes].Amount_TransactionCurrency_id from stats with id 7
Loaded histogram for column QCOL: [test.MasterData].[dbo].[CurrencyShareds].Id from stats with id 1
Selectivity: 0.01
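Here the selectivity computed for the join is fine, because 100 * 107,131 * 0.01 = 107,131. With the stale statistics and the histogram warning, the trace instead shows: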
CSelCalcExpressionComparedToExpression( QCOL: [test.MasterData].[dbo].[Annexes].Amount_TransactionCurrency_id x_cmpEq QCOL: [test.MasterData].[dbo].[CurrencyShareds].Id )
Loaded histogram for column QCOL: [test.MasterData].[dbo].[Annexes].Amount_TransactionCurrency_id from stats with id 7
Loaded histogram for column QCOL: [test.MasterData].[dbo].[CurrencyShareds].Id from stats with id 1 *** WARNING: badly-formed histogram ***
Selectivity: 4.59503e-018
The selectivity drops dramatically, so the estimated number of rows for the join is 1.
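The arithmetic behind both cases can be written out as a quick check. This sketch only multiplies the two table cardinalities by the selectivity values copied from the traces above. Reading the second result as "1 row" assumes the optimizer does not report estimates below one row, which is consistent with the CARD=1 shown in the trace.

-- Join estimate = left cardinality * right cardinality * join selectivity.
-- float literals (e0 suffix) keep the tiny selectivity from being rounded away.
SELECT 100e0 * 107131e0 * 0.01e0      AS estimate_good_histogram, -- approximately 107131
       100e0 * 107131e0 * 4.59503e-18 AS estimate_bad_histogram;  -- on the order of 5e-11, shown as 1 row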
After I added a row to annexes that references a CurrencyShareds row with a high ID, the histogram of IX_FK_Amount_TransactionCurrency changed to:
RANGE_HI_KEY RANGE_ROWS EQ_ROWS DISTINCT_RANGE_ROWS AVG_RANGE_ROWS
-------------------- ------------- ------------- -------------------- --------------
119762190797406475 0 173 0 1
119762190797406478 0 868 0 1
119762190797406481 0 107 0 1
119762190797406494 0 105745 0 1
119762190797406496 0 330 0 1
119762190797406618 0 1 0 1
119762190797406628 0 1 0 1
977955748426318623 0 1 0 1
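With this histogram the problem is gone, and adding new rows to CurrencyShareds no longer causes the dramatic drop in the cardinality estimate.
I suspect this comes from how the coarse histogram estimation algorithm works in SQL Server 2014 and later. My guess is based on this great post: https://www.sqlshack.com/join-estimation-internals/
Coarse histogram estimation is a new algorithm and is less documented, even in terms of general concepts. It is known that, instead of aligning the histograms step by step, it aligns them only on the minimum and maximum histogram boundaries. This method potentially introduces fewer CE errors (though not always, because we should remember that this is just a model).
The explanation is simple: our IDs are globally unique and partly based on a timestamp (a snowflake-based implementation). The most common currencies were added years ago, when the application started, and only a few of them are really used in production. That is why the histogram on annexes contains only currencies with "low" IDs.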
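To see the boundary mismatch directly, the lowest and highest histogram keys of the two statistics objects can be put side by side. This is a sketch, assuming sys.dm_db_stats_histogram is available (to my knowledge it was added in SQL Server 2016 SP1 CU2) and that the statistics names match the ones shown in the question:

-- range_high_key is sql_variant, so convert it to bigint before aggregating.
SELECT OBJECT_NAME(s.object_id)               AS table_name,
       s.name                                 AS stats_name,
       COUNT(*)                               AS histogram_steps,
       MIN(CONVERT(bigint, h.range_high_key)) AS lowest_key,
       MAX(CONVERT(bigint, h.range_high_key)) AS highest_key
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_histogram(s.object_id, s.stats_id) AS h
WHERE s.name IN (N'PK_CurrencyShareds_Id', N'IX_FK_Amount_TransactionCurrency')
GROUP BY s.object_id, s.name;

With the histograms from the question, the annexes side would end at 119762190797406496 while the CurrencyShareds side reaches 977955748426318592.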
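The problem appeared in our test environment, where some automated tests started adding test currencies, which caused some queries to run longer or time out...
We will update the statistics on these reference tables more often (we probably have the same problem with other, similar reference data tables). The tables are small, so updating statistics is not a problem.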
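A minimal sketch of such a refresh, assuming the table is dbo.CurrencyShareds. How it gets scheduled (an Agent job, a step in the test setup, and so on) is left open here.

-- FULLSCAN is cheap on a table of about 100 rows and reads every row
-- instead of a sample.
UPDATE STATISTICS dbo.CurrencyShareds WITH FULLSCAN;

-- Optional check that the histogram now covers the newly added IDs.
DBCC SHOW_STATISTICS ('dbo.CurrencyShareds', 'PK_CurrencyShareds_Id') WITH HISTOGRAM;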