Performance tuning for table gap analysis

wes*_*eve 6 sql-server-2005 sql-server gaps-and-islands

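I have a table that stores sequences (counters) of data received from field devices. The sequences need to be in order within a configurable time span, but rows can arrive in the system out of order. If a device is reset, its sequence number goes back to 0.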

CREATE TABLE [dbo].[TapGapDetail]
(
[Id] [bigint] NOT NULL, --FK to another table
[DeviceESN] [varchar] (100) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[TapDateUTC] [datetime] NOT NULL, --date event occurred on device
[CreatedDateUTC] [datetime] NOT NULL,
[Counter] [int] NOT NULL --sequence number reported by the device
)

CREATE CLUSTERED INDEX [CX_TapGapDetail] ON [dbo].[TapGapDetail] ([DeviceESN], [CreatedDateUTC], [Counter]) ON [PRIMARY]
GO

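I tried switching the order of DeviceESN and CreatedDateUTC in the clustered index, but on my system it did not seem to make much difference to IO. The table has millions of rows.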

Sample inserts would be:

INSERT INTO TapGapDetail VALUES (1, 'A', '1/1/2012 01:00AM', '1/1/2012 01:10AM', 5)
INSERT INTO TapGapDetail VALUES (2, 'A', '1/1/2012 12:05AM', '1/1/2012 01:15AM', 4) --out of order by insert date
INSERT INTO TapGapDetail VALUES (3, 'A', '1/2/2012 12:00AM', '1/2/2012 12:05AM', 6) --back in order
INSERT INTO TapGapDetail VALUES (4, 'A', '1/3/2012 01:00AM', '1/3/2012 01:05AM', 8) --missing 7
INSERT INTO TapGapDetail VALUES (5, 'A', '1/3/2012 06:00AM', '1/3/2012 06:05AM', 9) --in order outside 'tolerance'
INSERT INTO TapGapDetail VALUES (6, 'A', '1/10/2012 06:00AM', '1/10/2012 06:05AM', 0) --device reset

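The report needs to show that one gap occurred for this data. There is more detail to it, but one step at a time. I would like it to run in under 10 seconds, but on my local machine it seems to take at least 1.25 minutes. I have the STATISTICS IO output if needed. The procedure I created to get the "gaps" is this: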

CREATE PROC [dbo].[GetValidatorTapGapSummary]
    @Validator varchar(100) = ''
    ,@FilterDateUtc datetime
    ,@ToleranceHours int
AS

--temp table
SELECT 
RowID = ROW_NUMBER() OVER (ORDER BY DeviceESN, CreatedDateUTC, [Counter]),  
DeviceESN, 
TapDateUTC,
CreatedDateUTC,
[Counter]
INTO #taps
FROM TapGapDetail
WHERE CreatedDateUTC >= @FilterDateUtc
Order By 1

--cx
CREATE CLUSTERED INDEX CX1 ON #taps(RowID)

--results
select  
    t.DeviceESN as Validator
    ,sum(
        case 
            --They are in sequence
            when t2.[Counter] = t.[Counter]+ 1 then 0 
            --A reset has occurred
            when t2.[Counter] < t.[Counter] then 0 
            --A gap exists. Find the difference
            else (t2.[Counter] - t.[Counter] - 1) 
        end) 
        as TapGaps
    ,case 
        --gets the last tap date per validator
        when MAX(t.TapDateUTC) > MAX(t2.TapDateUTC) THEN MAX(T.TapDateUTC) 
        ELSE MAX(t2.TapDateUTC) 
    END 
    AS MaxTapDate
    ,case 
        --gets the last tap date per validator
        when MAX(t.CreatedDateUTC) > MAX(t2.CreatedDateUTC) THEN MAX(T.CreatedDateUTC) 
        ELSE MAX(t2.CreatedDateUTC) 
    END 
    AS MaxCreatedDate
from #Taps t 
    inner join #Taps t2 on t.DeviceESN = t2.DeviceESN and t2.RowID = t.RowID + 1
where t2.[Counter] != t.[Counter] + 1
 and t2.[Counter] > t.[Counter]
 And t.CreatedDateUTC >= @FilterDateUtc
 And t2.CreatedDateUTC >= @FilterDateUtc
 And (t.DeviceESN = @Validator Or @Validator = '')
 And Not Exists --edge case for when there is a gap at the end, tried with left join and stats are the same, so this is easier to read I think.
        (Select Top 1 Null From TapGapDetail tgd Where tgd.DeviceEsn = t.DeviceESN And (t.Counter + 1) = tgd.Counter 
            --the tolerance parameter is in hours, so use the hour datepart
            And tgd.CreatedDateUTC >= DateAdd(hour, -1 * @ToleranceHours, t.CreatedDateUTC)
            And tgd.CreatedDateUTC <= DateAdd(hour, @ToleranceHours, t.CreatedDateUTC))
group by t.DeviceESN
Order By MaxCreatedDate Desc, t.DeviceEsn

GO
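
For reference, a call to the procedure might look like the sketch below. The parameter values are assumptions chosen to cover the six sample rows; they are not taken from the real workload.

```sql
-- Illustrative call; the argument values are assumed for the sample data.
EXEC dbo.GetValidatorTapGapSummary
    @Validator      = 'A',          -- pass '' to report on every device
    @FilterDateUtc  = '20120101',   -- only rows created on or after this date
    @ToleranceHours = 24;           -- window searched for a late-arriving counter
```

If the table holds only the sample rows, this should return a single row for device 'A' with TapGaps = 1 (the missing counter 7).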

An example of the STATISTICS IO output:

Table 'TapGapDetail'. Scan count 5, logical reads 24060, physical reads 400, read-ahead reads 4590, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(2763981 row(s) affected)

(1 row(s) affected)
Table '#taps_______________________________________________________________________________________________________________000000000051'. Scan count 1, logical reads 18246, physical reads 0, read-ahead reads 251, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(154 row(s) affected)
Table '#taps_______________________________________________________________________________________________________________000000000051'. Scan count 10, logical reads 37834, physical reads 0, read-ahead reads 1240, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'TapGapDetail'. Scan count 5, logical reads 13563, physical reads 3, read-ahead reads 13445, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

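I have tried various indexes on the temp table, as well as no index at all, but I suspect that just populating it takes too long. Any suggestions are appreciated.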

Rob*_*ley 5

Use a different approach.

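First, don't populate a temp table with 2.7 million rows; that is not going to return within 10 seconds. You can use a CTE instead, which should perform better.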

WITH taps as (
SELECT 
RowID = ROW_NUMBER() OVER (ORDER BY DeviceESN, CreatedDateUTC, [Counter]),  
DeviceESN, 
TapDateUTC,
CreatedDateUTC,
[Counter]
FROM TapGapDetail
WHERE CreatedDateUTC >= @FilterDateUtc
)

--results
select  
    t.DeviceESN as Validator
    ,sum(
        case 
            --They are in sequence
            when t2.[Counter] = t.[Counter]+ 1 then 0 
            --A reset has occurred
            when t2.[Counter] < t.[Counter] then 0 
            --A gap exists. Find the difference
            else (t2.[Counter] - t.[Counter] - 1) 
        end) 
        as TapGaps
    ,case 
        --gets the last tap date per validator
        when MAX(t.TapDateUTC) > MAX(t2.TapDateUTC) THEN MAX(T.TapDateUTC) 
        ELSE MAX(t2.TapDateUTC) 
    END 
    AS MaxTapDate
    ,case 
        --gets the last tap date per validator
        when MAX(t.CreatedDateUTC) > MAX(t2.CreatedDateUTC) THEN MAX(T.CreatedDateUTC) 
        ELSE MAX(t2.CreatedDateUTC) 
    END 
    AS MaxCreatedDate
from taps t 
    inner join taps t2 on t.DeviceESN = t2.DeviceESN and t2.RowID = t.RowID + 1
where t2.[Counter] != t.[Counter] + 1
 and t2.[Counter] > t.[Counter]
 And t.CreatedDateUTC >= @FilterDateUtc
 And t2.CreatedDateUTC >= @FilterDateUtc
 And (t.DeviceESN = @Validator Or @Validator = '')
 And Not Exists --edge case for when there is a gap at the end, tried with left join and stats are the same, so this is easier to read I think.
        (Select Top 1 Null From TapGapDetail tgd Where tgd.DeviceEsn = t.DeviceESN And (t.Counter + 1) = tgd.Counter 
            --the tolerance parameter is in hours, so use the hour datepart
            And tgd.CreatedDateUTC >= DateAdd(hour, -1 * @ToleranceHours, t.CreatedDateUTC)
            And tgd.CreatedDateUTC <= DateAdd(hour, @ToleranceHours, t.CreatedDateUTC))
group by t.DeviceESN
Order By MaxCreatedDate Desc, t.DeviceEsn

I'm not saying this is necessarily great (the NOT EXISTS part could be painful), but it is almost certainly an improvement.
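
One possible way to ease the NOT EXISTS probe, offered as an untested assumption and not as something measured against this data, is a nonclustered index keyed on the columns the subquery filters on. The index name below is hypothetical.

```sql
-- Hypothetical supporting index; the name is made up for illustration.
-- Equality columns (DeviceESN, Counter) come first, then the range column
-- (CreatedDateUTC), so the subquery can seek instead of scanning a date range.
CREATE NONCLUSTERED INDEX IX_TapGapDetail_DeviceESN_Counter
    ON dbo.TapGapDetail (DeviceESN, [Counter], CreatedDateUTC);
```

The existing clustered index puts CreatedDateUTC ahead of Counter, so the probe has to read every row for the device inside the date window and then filter on Counter. Whether the extra index pays for its write overhead would need to be checked with STATISTICS IO on the real table.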