How to aggregate (count distinct items) over a sliding window in SQL Server?

Roc*_*nce 11 sql sql-server count aggregate-functions sliding-window

I am currently using this query (in SQL Server) to count the number of unique items per day:

SELECT Date, COUNT(DISTINCT item) 
FROM myTable 
GROUP BY Date 
ORDER BY Date

How can I transform this to get, for each date, the number of unique items over the past 3 days (including the current day)?

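The output should be a table with 2 columns: the first contains all the dates in the original table, and the second contains the number of unique items for each date.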

For example, if the original table is:

Date        Item  
01/01/2018  A  
01/01/2018  B  
02/01/2018  C  
03/01/2018  C    
04/01/2018  C

With my query above, I currently get the unique count for each day:

Date        count  
01/01/2018  2  
02/01/2018  1  
03/01/2018  1  
04/01/2018  1

I would like to get the unique count over a 3-day rolling window:

Date        count  
01/01/2018  2  
02/01/2018  3  (because items ABC on 1st and 2nd Jan)
03/01/2018  3  (because items ABC on 1st,2nd,3rd Jan)    
04/01/2018  1  (because only item C on 2nd,3rd,4th Jan)    

Use*_*ady 7

Using an apply provides a convenient way to form a sliding window:

CREATE TABLE myTable 
    ([DateCol] datetime, [Item] varchar(1))
;

INSERT INTO myTable 
    ([DateCol], [Item])
VALUES
    ('2018-01-01 00:00:00', 'A'),
    ('2018-01-01 00:00:00', 'B'),
    ('2018-01-02 00:00:00', 'C'),
    ('2018-01-03 00:00:00', 'C'),
    ('2018-01-04 00:00:00', 'C')
;

CREATE NONCLUSTERED INDEX IX_DateCol  
    ON myTable([DateCol])  
;    

Query:

select distinct 
       t1.dateCol
     , oa.ItemCount
from myTable t1
outer apply (
      select count(distinct t2.item) as ItemCount
      from myTable t2
      where t2.DateCol between dateadd(day,-2,t1.DateCol) and t1.DateCol
  ) oa
order by t1.dateCol ASC

Result:

|              dateCol | ItemCount |
|----------------------|-----------|
| 2018-01-01T00:00:00Z |         2 |
| 2018-01-02T00:00:00Z |         3 |
| 2018-01-03T00:00:00Z |         3 |
| 2018-01-04T00:00:00Z |         1 |

There may be some performance gain from reducing the date column to its distinct values before using apply, like so:

select 
       d.DateCol
     , oa.ItemCount
from (
    select distinct t1.DateCol
    from myTable t1
     ) d
outer apply (
      select count(distinct t2.item) as ItemCount
      from myTable t2
      where t2.DateCol between dateadd(day,-2,d.DateCol) and d.DateCol
  ) oa
order by d.DateCol ASC
;

Instead of using select distinct in the subquery, you could use group by, but the execution plan will remain the same.
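As a sketch of that group by variant, assuming the myTable(DateCol, Item) schema from the setup script earlier in this answer:

```sql
-- Same sliding-window count, with GROUP BY instead of SELECT DISTINCT
-- to reduce myTable to one row per date before the apply.
select
       d.DateCol
     , oa.ItemCount
from (
    select t1.DateCol
    from myTable t1
    group by t1.DateCol
     ) d
outer apply (
      -- distinct items seen on this date or on the two days before it
      select count(distinct t2.Item) as ItemCount
      from myTable t2
      where t2.DateCol between dateadd(day,-2,d.DateCol) and d.DateCol
  ) oa
order by d.DateCol asc
;
```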

Demo on SQL Fiddle


Sal*_*n A 5

The most straightforward solution is to join the table with itself based on the dates:

SELECT t1.DateCol, COUNT(DISTINCT t2.Item) AS C
FROM testdata AS t1 
LEFT JOIN testdata AS t2 ON t2.DateCol BETWEEN DATEADD(dd, -2, t1.DateCol) AND t1.DateCol
GROUP BY t1.DateCol
ORDER BY t1.DateCol

输出:

| DateCol                 | C |
|-------------------------|---|
| 2018-01-01 00:00:00.000 | 2 |
| 2018-01-02 00:00:00.000 | 3 |
| 2018-01-03 00:00:00.000 | 3 |
| 2018-01-04 00:00:00.000 | 1 |
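The self-join query reads from a table named testdata, which this answer does not define. A minimal setup sketch, assuming it holds the same rows as the question's sample data (the column types are an assumption, not something the answer states):

```sql
-- Hypothetical setup for the testdata table used by the self-join query.
CREATE TABLE testdata (
    DateCol datetime,
    Item    varchar(1)
);

-- 'YYYYMMDD' literals are used because SQL Server parses them the same
-- way regardless of the session's DATEFORMAT / language settings.
INSERT INTO testdata (DateCol, Item)
VALUES
    ('20180101', 'A'),
    ('20180101', 'B'),
    ('20180102', 'C'),
    ('20180103', 'C'),
    ('20180104', 'C');
```

Each row of t1 always matches at least itself in the join, so every date in the table appears in the result.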


Jua*_*eza 1

Use the GETDATE() function to get the current date, and DATEADD() to get the last 3 days:

 SELECT Date, count(DISTINCT item) 
 FROM myTable 
 WHERE [Date] >= DATEADD(day,-3, GETDATE())
 GROUP BY Date 
 ORDER BY Date
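GETDATE() returns the current date and time, so DATEADD(day, -3, GETDATE()) is a cutoff exactly 72 hours in the past rather than at a day boundary. A sketch that aligns the cutoff to midnight instead, assuming the [Date] and item column names from the question (the @cutoff variable is introduced here for illustration):

```sql
-- Sketch: cut off at midnight two days ago, so the filter covers
-- today and the two preceding calendar days (3 days in total).
DECLARE @cutoff date = DATEADD(day, -2, CAST(GETDATE() AS date));

SELECT [Date], COUNT(DISTINCT item) AS ItemCount
FROM myTable
WHERE [Date] >= @cutoff
GROUP BY [Date]
ORDER BY [Date];
```

Like the query it is based on, this returns one distinct count per day and only limits which days appear; it does not compute a rolling count across days.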