Bak*_*won 14 parameters performance select join sql-server-2008
我有3个版本的查询,最终返回相同的结果.
当向一个相对较小的表添加额外的内部连接时,其中一个变得非常慢,并且在where子句中使用参数变量.
对于快速和慢速查询(包括在每个查询下面),执行计划是非常不同的.
我想了解为什么会发生这种情况以及如何防止它.
此查询需要<1秒.它没有额外的内连接,但它在where子句中使用参数变量.
declare @start datetime = '20120115'
declare @end datetime = '20120116'
select distinct sups.campaignid
from tblSupporterMainDetails sups
inner join tblCallLogs calls on sups.supporterid = calls.supporterid
where calls.callEnd between @start and @end
|--Parallelism(Gather Streams)
|--Sort(DISTINCT ORDER BY:([sups].[campaignID] ASC))
|--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([sups].[campaignID]))
|--Hash Match(Partial Aggregate, HASH:([sups].[campaignID]))
|--Hash Match(Inner Join, HASH:([calls].[supporterID])=([sups].[supporterID]))
|--Bitmap(HASH:([calls].[supporterID]), DEFINE:([Bitmap1004]))
| |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([calls].[supporterID]))
| |--Index Seek(OBJECT:([GOGEN].[dbo].[tblCallLogs].[IX_tblCallLogs_callend_supporterid] AS [calls]), SEEK:([calls].[callEnd] >= '2012-01-15 00:00:00.000' AND [calls].[callEnd] <= '2012-01-16 00:00:00.000') ORDERED FORWARD)
|--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([sups].[supporterID]))
|--Index Scan(OBJECT:([GOGEN].[dbo].[tblSupporterMainDetails].[AUTOGEN_IX_tblSupporterMainDetails_campaignID] AS [sups]), WHERE:(PROBE([Bitmap1004],[GOGEN].[dbo].[tblSupporterMainDetails].[supporterID] as [sups].[supporterID],N'[IN ROW]')))
Run Code Online (Sandbox Code Playgroud)
此查询需要<1秒.它有一个额外的内连接BUT在where子句中使用参数常量.
select distinct camps.campaignid
from tblCampaigns camps
inner join tblSupporterMainDetails sups on camps.campaignid = sups.campaignid
inner join tblCallLogs calls on sups.supporterid = calls.supporterid
where calls.callEnd between '20120115' and '20120116'
|--Parallelism(Gather Streams)
|--Hash Match(Right Semi Join, HASH:([sups].[campaignID])=([camps].[campaignID]))
|--Bitmap(HASH:([sups].[campaignID]), DEFINE:([Bitmap1007]))
| |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([sups].[campaignID]))
| |--Hash Match(Partial Aggregate, HASH:([sups].[campaignID]))
| |--Hash Match(Inner Join, HASH:([calls].[supporterID])=([sups].[supporterID]))
| |--Bitmap(HASH:([calls].[supporterID]), DEFINE:([Bitmap1006]))
| | |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([calls].[supporterID]))
| | |--Index Seek(OBJECT:([GOGEN].[dbo].[tblCallLogs].[IX_tblCallLogs_callend_supporterid] AS [calls]), SEEK:([calls].[callEnd] >= '2012-01-15 00:00:00.000' AND [calls].[callEnd] <= '2012-01-16 00:00:00.000') ORDERED FORWARD)
| |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([sups].[supporterID]))
| |--Index Scan(OBJECT:([GOGEN].[dbo].[tblSupporterMainDetails].[AUTOGEN_IX_tblSupporterMainDetails_campaignID] AS [sups]), WHERE:(PROBE([Bitmap1006],[GOGEN].[dbo].[tblSupporterMainDetails].[supporterID] as [sups].[supporterID],N'[IN ROW]')))
|--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([camps].[campaignID]))
|--Index Scan(OBJECT:([GOGEN].[dbo].[tblCampaigns].[IX_tblCampaigns_isActive] AS [camps]), WHERE:(PROBE([Bitmap1007],[GOGEN].[dbo].[tblCampaigns].[campaignID] as [camps].[campaignID],N'[IN ROW]')))
Run Code Online (Sandbox Code Playgroud)
此查询需要2分钟.它有一个额外的内连接,它在where子句中使用参数变量.
declare @start datetime = '20120115'
declare @end datetime = '20120116'
select distinct camps.campaignid
from tblCampaigns camps
inner join tblSupporterMainDetails sups on camps.campaignid = sups.campaignid
inner join tblCallLogs calls on sups.supporterid = calls.supporterid
where calls.callEnd between @start and @end
|--Nested Loops(Inner Join, OUTER REFERENCES:([camps].[campaignID]))
|--Index Scan(OBJECT:([GOGEN].[dbo].[tblCampaigns].[IX_tblCampaigns_isActive] AS [camps]))
|--Top(TOP EXPRESSION:((1)))
|--Nested Loops(Inner Join, OUTER REFERENCES:([calls].[callID], [Expr1007]) OPTIMIZED WITH UNORDERED PREFETCH)
|--Nested Loops(Inner Join, OUTER REFERENCES:([sups].[supporterID], [Expr1006]) WITH UNORDERED PREFETCH)
| |--Index Seek(OBJECT:([GOGEN].[dbo].[tblSupporterMainDetails].[AUTOGEN_IX_tblSupporterMainDetails_campaignID] AS [sups]), SEEK:([sups].[campaignID]=[GOGEN].[dbo].[tblCampaigns].[campaignID] as [camps].[campaignID]) ORDERED FORWARD)
| |--Index Seek(OBJECT:([GOGEN].[dbo].[tblCallLogs].[IX_tblCallLogs_supporterID_closingCall] AS [calls]), SEEK:([calls].[supporterID]=[GOGEN].[dbo].[tblSupporterMainDetails].[supporterID] as [sups].[supporterID]) ORDERED FORWARD)
|--Clustered Index Seek(OBJECT:([GOGEN].[dbo].[tblCallLogs].[AUTOGEN_PK_tblCallLogs] AS [calls]), SEEK:([calls].[callID]=[GOGEN].[dbo].[tblCallLogs].[callID] as [calls].[callID]), WHERE:([GOGEN].[dbo].[tblCallLogs].[callEnd] as [calls].[callEnd]>=[@s2] AND [GOGEN].[dbo].[tblCallLogs].[callEnd] as [calls].[callEnd]<=[@e2]) LOOKUP ORDERED FORWARD)
Run Code Online (Sandbox Code Playgroud)
tblCallLogs,但我不知道为什么SQL Server会选择这个执行计划.tblCampaigns和tblSupporterMainDetails.这没有效果.campaignid已编制索引.更新:
tblcalllogs.没有效果.DBCC FREEPROCCACHEtblCampaign.campaignid int not null
tblSupporterMainDetails.campaignid int not null
tblSupporterMainDetails.supporterid int not null
tblCallLogs.supporterid int not null
tblCallLogs.callEnd datetime not null
Run Code Online (Sandbox Code Playgroud)

更新2:将索引添加到tblCallLogs.supporterId后 - 使用include列:callEnd
'慢'查询加速最多40秒.更新的执行计划:
|--Nested Loops(Inner Join, OUTER REFERENCES:([camps].[campaignID]))
|--Index Scan(OBJECT:([GOGEN].[dbo].[tblCampaigns].[IX_tblCampaigns_isActive] AS [camps]))
|--Top(TOP EXPRESSION:((1)))
|--Nested Loops(Inner Join, OUTER REFERENCES:([sups].[supporterID], [Expr1006]) WITH UNORDERED PREFETCH)
|--Index Seek(OBJECT:([GOGEN].[dbo].[tblSupporterMainDetails].[AUTOGEN_IX_tblSupporterMainDetails_campaignID] AS [sups]), SEEK:([sups].[campaignID]=[GOGEN].[dbo].[tblCampaigns].[campaignID] as [camps].[campaignID]) ORDERED FORWARD)
|--Index Seek(OBJECT:([GOGEN].[dbo].[tblCallLogs].[IX_tblCallLogs_supporterid_callend] AS [calls]), SEEK:([calls].[supporterID]=[GOGEN].[dbo].[tblSupporterMainDetails].[supporterID] as [sups].[supporterID]), WHERE:([GOGEN].[dbo].[tblCallLogs].[callEnd] as [calls].[callEnd]>=[@s2] AND [GOGEN].[dbo].[tblCallLogs].[callEnd] as [calls].[callEnd]<=[@e2]) ORDERED FORWARD)
Run Code Online (Sandbox Code Playgroud)
额外的连接实际上并没有直接导致问题,但它显然改变了语句,以便sql server为它保持不同的执行计划.
通过在慢速语句的末尾添加OPTION(RECOMPILE),我能够获得预期的快速性能.即<1秒.我仍然不确定这个解决方案是否有效 - 为什么没有冲洗所有的计划工作?这是参数嗅探的经典案例吗?我会在知道确切答案时更新这篇文章 - 或者直到有人能给出明确的答案.感谢@LievenKeersmaekers和@JNK迄今为止的帮助......
解决方案的总结:
在 上添加覆盖索引supporterid, callEnd。
这里的假设是优化器可以使用这个索引(与 callEnd、supporterid 相比)来
tblSupporterMainDetails并tblCallLogs where选择子句中使用它callEnd添加选项OPTION(RECOMPILE)
所有 cudo 都归功于 TiborK 和 Hunchback,它们向优化器解释了使用硬编码常量或变量的差异。
当您使用该常量时,优化器已知该值,因此它可以根据该值确定选择性(以及可能的索引使用情况)。当您使用变量时,优化器未知该值(因此它必须采用一些硬连线值或可能的密度信息)。因此,从技术上讲,这不是参数嗅探,但无论您找到有关该主题的任何文章,也应该解释常量和变量之间的区别。使用 OPTION(RECOMPILE) 实际上会将变量转变为参数嗅探情况。
本质上,常量、变量和参数(可以嗅探)之间有很大的区别。