Fixing the MAU value when calculating DAU and MAU on Amazon Redshift

Pat*_*bug 3 sql amazon-redshift

Following this article, I am using the query below to calculate MAU and DAU:

WITH dau AS
(
  SELECT TRUNC(created_at) AS created_at,
         COUNT(DISTINCT member_id) AS dau
  FROM table ds
  WHERE ds.created_at BETWEEN '2018-09-03' AND '2018-09-08'
  GROUP BY TRUNC(created_at)
)
SELECT created_at,
       dau,
       (SELECT COUNT(DISTINCT member_id)
        FROM table ds
        WHERE ds.created_at BETWEEN created_at - 29*INTERVAL '1 day' AND created_at) AS mau
FROM dau
ORDER BY created_at

I ran this query and got the following result:

2018-09-03  12844  3976132
2018-09-04  54236  3976132
2018-09-05  58631  3976132
2018-09-06  59786  3976132
2018-09-07  52317  3976132
2018-09-08      4  3976132

The MAU column clearly shows the same value on every row. How can I fix this? Any pointers would be helpful.

Luk*_*zda 5

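You should qualify the column names. Inside the subquery, an unqualified created_at resolves to ds.created_at rather than to the row from dau, so the BETWEEN condition is true for every row and the count covers the whole table: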

WITH dau AS
(
  SELECT TRUNC(created_at) AS created_at,
         COUNT(DISTINCT member_id) AS dau
  FROM table ds
  WHERE ds.created_at BETWEEN '2018-09-03' AND '2018-09-08'
  GROUP BY TRUNC(created_at)
)
SELECT created_at,
       dau,
       (SELECT COUNT(DISTINCT member_id)
        FROM table ds
        WHERE ds.created_at
          BETWEEN dau.created_at - 29*INTERVAL '1 day' AND dau.created_at) AS mau
          -- dau.created_at is qualified so it refers to the CTE row, not to ds.created_at
FROM dau
ORDER BY created_at
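To see the scoping effect in isolation, here is a minimal sketch that reproduces it locally. It makes several assumptions: it uses SQLite through Python's sqlite3 module as a stand-in for Redshift, the events table and its four rows are made up for illustration, and SQLite's date(x, '-29 days') replaces the interval arithmetic. Only the name-resolution behaviour is the point.

```python
import sqlite3

# In-memory SQLite database standing in for Redshift (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (created_at TEXT, member_id INTEGER);  -- hypothetical table
    INSERT INTO events VALUES
        ('2018-08-01', 1),   -- more than 29 days before the reporting window
        ('2018-08-20', 4),   -- inside the 30-day lookback
        ('2018-09-03', 2),
        ('2018-09-04', 2),
        ('2018-09-04', 3);
""")

# {ref} is filled in with either the unqualified or the qualified column name.
QUERY = """
WITH dau AS (
    SELECT created_at, COUNT(DISTINCT member_id) AS dau
    FROM events
    WHERE created_at BETWEEN '2018-09-03' AND '2018-09-04'
    GROUP BY created_at
)
SELECT created_at,
       dau,
       (SELECT COUNT(DISTINCT member_id)
        FROM events ds
        WHERE ds.created_at BETWEEN date({ref}, '-29 days') AND {ref}) AS mau
FROM dau
ORDER BY created_at
"""

# Unqualified: created_at binds to ds.created_at inside the subquery,
# so the condition is always true and every row gets the table-wide count.
unqualified = conn.execute(QUERY.format(ref="created_at")).fetchall()

# Qualified: dau.created_at binds to the outer CTE row, giving a rolling count.
qualified = conn.execute(QUERY.format(ref="dau.created_at")).fetchall()

print(unqualified)  # [('2018-09-03', 1, 4), ('2018-09-04', 2, 4)]
print(qualified)    # [('2018-09-03', 1, 2), ('2018-09-04', 2, 3)]
```

With the unqualified name, the MAU column repeats the distinct-member count of the entire table, which matches the symptom in the question.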

Alternatively:

SELECT d.created_at,
       d.dau,
       COUNT(DISTINCT ds.member_id) AS mau
FROM (SELECT TRUNC(created_at) AS created_at,
             COUNT(DISTINCT member_id) AS dau
      FROM table
      WHERE created_at BETWEEN '2018-09-03' AND '2018-09-08'
      GROUP BY TRUNC(created_at)) d
JOIN table ds
  ON ds.created_at BETWEEN d.created_at - 29*INTERVAL '1 day' AND d.created_at
GROUP BY d.created_at, d.dau
ORDER BY d.created_at