Rolling N-day active users (BigQuery)

Dav*_*ith 3 google-bigquery

I have a table "events" made up of 2 columns:

userId | eventDate
-------+-------------------
s234124| 2015-01-01
a2s3166| 2015-01-02
c216782| 2015-01-03
z312235| 2015-01-04

userId is the user identifier. eventDate is the date on which that user had an event.

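For each day, I want to count the unique active users in the 30-day (or 7-day, 60-day, etc.) period ending on that date. A unique active user is defined as a userId with at least one event during the given window.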

I have read this post, which describes a similar problem, but I am having a hard time adapting it to my use case.

Mik*_*ant 6

Assuming your table, dataset.your_table, has two fields, userid and date:

SELECT 
  date,
  -- Pivot the per-period counts into one column per window length
  SUM(CASE WHEN period = 7  THEN users END) as days_07,
  SUM(CASE WHEN period = 14 THEN users END) as days_14,
  SUM(CASE WHEN period = 30 THEN users END) as days_30
FROM (
  SELECT
    dates.date as date,  
    periods.period as period,
    EXACT_COUNT_DISTINCT(activity.userid) as users
  FROM dataset.your_table as activity
  -- One row per distinct date in the table (the end date of each window)
  CROSS JOIN (SELECT date FROM dataset.your_table GROUP BY date) as dates
  -- Window lengths in days; in legacy SQL a comma between subqueries means UNION ALL
  CROSS JOIN (SELECT period FROM (SELECT 7 as period), 
                (SELECT 14 as period), (SELECT 30 as period)) as periods
  -- Keep events that fall 0 to period-1 days before the window's end date
  WHERE dates.date >= activity.date 
  AND INTEGER(FLOOR(DATEDIFF(dates.date, activity.date)/periods.period)) = 0
  GROUP BY 1,2
)
GROUP BY date
ORDER BY date DESC

The result looks like this:

date           days_07     days_14     days_30
8/29/2015    2,468,649   3,597,684   7,180,175 
8/28/2015    2,472,342   3,592,680   6,969,581 
8/27/2015    2,486,979   3,595,822   6,745,625 
8/26/2015    2,507,572   3,576,816   6,494,710 
8/25/2015    2,508,036   3,553,386   6,264,950 
8/24/2015    2,511,946   3,521,184   6,024,151 
8/23/2015    2,488,485   3,482,163   5,774,763 
8/22/2015    2,474,526   3,450,719   5,547,318 
8/21/2015    2,463,568   3,422,003   5,327,760 
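To sanity-check the window condition on a small sample, the same logic can be sketched in plain Python. This is only an illustration, not part of the query: the function name `rolling_active_users` is made up for this example, the rows are the four sample rows from the question, and the filter mirrors the WHERE clause in the query above (an event counts if it falls 0 to period-1 days before the window's end date).

```python
from datetime import date

# Sample rows from the question: (userId, eventDate).
events = [
    ("s234124", date(2015, 1, 1)),
    ("a2s3166", date(2015, 1, 2)),
    ("c216782", date(2015, 1, 3)),
    ("z312235", date(2015, 1, 4)),
]


def rolling_active_users(events, period):
    """Hypothetical helper: for each date present in the data, count the
    distinct users with at least one event in the `period`-day window
    ending on that date."""
    result = {}
    for day in sorted({d for _, d in events}):
        # Same condition as the WHERE clause: 0 <= day - event_date < period.
        users = {u for u, d in events if 0 <= (day - d).days < period}
        result[day] = len(users)
    return result


# Print the 2-day rolling count for each date in the sample.
for day, users in rolling_active_users(events, 2).items():
    print(day, users)
```

As in the query, only dates that actually appear in the table get a row; a day with no events at all is skipped rather than reported with a count carried over from earlier days.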