Counting unique IDs over a rolling time window

Eri*_*ikK 5 ansi-sql google-bigquery

I have a simple table with many IDs and dates, as shown below.

ID      Date
10R46   2014-11-23  
10R46   2016-04-11  
100R9   2016-12-21
10R91   2013-05-03 
...     ...

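I want to write a query that counts the unique IDs over a rolling time window of dates, for example ten days. That is, for each date it should give me the number of unique IDs between that date and ten days before it. The result should look like this.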

UniqueTenDays    Date
200              2014-11-23 
324              2014-11-24 
522              2014-11-25
532              2014-11-26 
...              ...

Something like the query below, but I realize I need to apply the WHERE clause and somehow count the IDs for each Date.

SELECT Date, COUNT(DISTINCT ID)
FROM T 
WHERE Date BETWEEN DATE_SUB(Date, INTERVAL 10 DAY) AND Date
GROUP BY Date
ORDER BY Date

Thanks in advance.

Mik*_*ant 6

Below is for BigQuery Standard SQL

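#standardSQL
WITH temp1 AS (
  -- One row per date, with that day's distinct ids as a comma-separated string
  SELECT dt, STRING_AGG(DISTINCT id) AS users
  FROM `project.dataset.yourtable`
  GROUP BY dt
), temp2 AS (
  -- RANGE over UNIX_DATE(dt) spans calendar days (dt - 10 through dt, inclusive), not rows
  SELECT
    dt,
    STRING_AGG(users) OVER(ORDER BY UNIX_DATE(dt) RANGE BETWEEN 10 PRECEDING AND CURRENT ROW) AS users
  FROM temp1
)
-- Split the combined string back into ids and count the distinct ones per date
SELECT dt,
  (SELECT COUNT(DISTINCT id) FROM UNNEST(SPLIT(users)) AS id) AS UniqueTenDays
FROM temp2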

You can test / play with it using dummy data, as below

#standardSQL
WITH `project.dataset.yourtable` AS (
  SELECT '10R46' id,  DATE '2014-11-23' dt UNION ALL  
  SELECT '10R46',     DATE '2016-04-11' UNION ALL  
  SELECT '10R46',     DATE '2016-04-12' UNION ALL  
  SELECT '10R47',     DATE '2016-04-13' UNION ALL  
  SELECT '10R48',     DATE '2016-04-14' UNION ALL  
  SELECT '100R9',     DATE '2016-12-21' UNION ALL
  SELECT '10R91',     DATE '2013-05-03'
), temp1 AS (
  SELECT dt, STRING_AGG(DISTINCT id) AS users
  FROM `project.dataset.yourtable`
  GROUP BY dt
), temp2 AS (
  SELECT
    dt, 
    STRING_AGG(users) OVER(ORDER BY UNIX_DATE(dt) RANGE BETWEEN 10 PRECEDING AND CURRENT ROW) users
  FROM temp1
)
SELECT dt,  
  (SELECT COUNT(DISTINCT id) FROM UNNEST(SPLIT(users)) AS id) UniqueTenDays
FROM temp2
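To sanity-check what the query should return for the dummy data, here is a minimal Python sketch of the same rolling distinct count. It is not part of the original answer: the function name `rolling_unique` and its `days` parameter are made up for illustration, and it assumes the window is inclusive on both ends (the date itself plus the 10 preceding calendar days), matching `RANGE BETWEEN 10 PRECEDING AND CURRENT ROW`.

```python
from datetime import date, timedelta

# Same dummy rows as in the query above: (id, dt)
rows = [
    ("10R46", date(2014, 11, 23)),
    ("10R46", date(2016, 4, 11)),
    ("10R46", date(2016, 4, 12)),
    ("10R47", date(2016, 4, 13)),
    ("10R48", date(2016, 4, 14)),
    ("100R9", date(2016, 12, 21)),
    ("10R91", date(2013, 5, 3)),
]


def rolling_unique(rows, days=10):
    """Map each distinct date to the number of distinct ids seen
    from `days` days before that date through the date itself."""
    result = {}
    for dt in sorted({d for _, d in rows}):
        start = dt - timedelta(days=days)
        # Collect ids whose date falls inside the inclusive window [start, dt]
        result[dt] = len({i for i, d in rows if start <= d <= dt})
    return result


for dt, n in rolling_unique(rows).items():
    print(dt, n)
```

With this data, the only dates whose count exceeds 1 are 2016-04-13 (2 ids) and 2016-04-14 (3 ids), since the three April IDs fall within ten days of each other. The brute-force scan is quadratic in the number of rows, so it is only meant for checking small samples, not as a replacement for the query.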