Por*_*Kev 5 sql google-bigquery
I am trying to do something like this in BigQuery:
COUNT(DISTINCT user_id) OVER (PARTITION BY DATE_TRUNC(date, month), sample, app_id ORDER BY DATE RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as ACTIVE_USERS
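In other words, I have a table with date, user ID, sample, and app ID. I need the cumulative number of unique active users for each day, counted from the start of the month through the end of that day.
The function works without DISTINCT, but then it gives me the total number of user rows, which is not what I need.
I tried some tricks with dense_rank, but they do not work here.
Is there any way to count distinct users using a window function?
-------------UPDATE---------------- Here is the full query, so you can better understand what I need: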
with mtd1 as (select
'MonthToDate' as TIMELINE
,fd.date DATE
,td.SAMPLE as SAMPLE
,td.APPNAME as APP_ID
,sum(fd.revenue) as REVENUE
,td.user_id ACTIVE_USERS
from DWH.DailyUser fd
join DWH.Depositors td using (userid)
group by 1,2,3,4,6
),
mtd as (
select TIMELINE
,DATE
,SAMPLE
,APP_ID
,sum(revenue) over (partition by date_trunc(date, month), sample, app_id order by date range BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as REVENUE
,COUNT(distinct active_users) over (partition by date_trunc(date, month), sample, app_id order by date range BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as ACTIVE_USERS
from mtd1
)
select * from mtd
where extract(day from date) = extract(day from current_date)
group by 1,2,3,4,5,6
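Distinct in window functions. BigQuery - is there any way to count distinct users using a window function?
This specific question is a duplicate and has already been answered here
...here is the full query...
As for how to apply the above to your specific query, see below (untested and based entirely on your code):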
#standardSQL
WITH mtd1 AS (
SELECT
'MonthToDate' AS TIMELINE
,fd.date DATE
,td.SAMPLE AS SAMPLE
,td.APPNAME AS APP_ID
,SUM(fd.revenue) AS REVENUE
,td.user_id ACTIVE_USERS
FROM `DWH.DailyUser` fd
JOIN `DWH.Depositors` td USING (userid)
GROUP BY 1,2,3,4,6
), mtd2 AS (
SELECT
TIMELINE
,DATE
,SAMPLE
,APP_ID
,SUM(REVENUE) OVER (PARTITION BY DATE_TRUNC(DATE, MONTH), SAMPLE, APP_ID ORDER BY DATE RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS REVENUE
,ARRAY_AGG(ACTIVE_USERS) OVER (PARTITION BY DATE_TRUNC(DATE, MONTH), SAMPLE, APP_ID ORDER BY DATE RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS ACTIVE_USERS
FROM mtd1
), mtd AS (
SELECT * REPLACE((SELECT COUNT(DISTINCT u) FROM UNNEST(ACTIVE_USERS) AS u) AS ACTIVE_USERS)
FROM mtd2
)
SELECT * FROM mtd
WHERE EXTRACT(day FROM DATE) = EXTRACT(day FROM CURRENT_DATE)
GROUP BY 1,2,3,4,5,6
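
To see the technique in isolation, here is a minimal self-contained sketch with inline data. The `events` CTE, its rows, and the column name `activity_date` are made up for illustration and are not part of the original tables. The idea is the same as above: since BigQuery rejects `COUNT(DISTINCT ...)` combined with `ORDER BY` in the `OVER` clause, collect the user IDs seen so far with `ARRAY_AGG` as a window function, then count the distinct elements of that array in a scalar subquery.

```sql
#standardSQL
-- Hypothetical sample data: one row per (activity_date, user_id).
WITH events AS (
  SELECT DATE '2024-01-01' AS activity_date, 'u1' AS user_id UNION ALL
  SELECT DATE '2024-01-01', 'u2' UNION ALL
  SELECT DATE '2024-01-02', 'u1' UNION ALL
  SELECT DATE '2024-01-03', 'u3' UNION ALL
  SELECT DATE '2024-02-01', 'u1'
), windowed AS (
  -- Step 1: for every row, gather all user IDs from the start of the
  -- month up to and including the current date (duplicates included).
  SELECT
    activity_date,
    ARRAY_AGG(user_id) OVER (
      PARTITION BY DATE_TRUNC(activity_date, MONTH)
      ORDER BY activity_date
      RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS users_so_far
  FROM events
), counted AS (
  -- Step 2: count the distinct elements of each collected array.
  SELECT
    activity_date,
    (SELECT COUNT(DISTINCT u) FROM UNNEST(users_so_far) AS u) AS active_users
  FROM windowed
)
-- Step 3: collapse the per-row results to one row per date.
SELECT DISTINCT activity_date, active_users
FROM counted
ORDER BY activity_date
-- Expected result for the sample rows:
--   2024-01-01  2
--   2024-01-02  2
--   2024-01-03  3
--   2024-02-01  1
```

Note that the count restarts in February because the partition is the truncated month. The intermediate arrays grow with the number of rows in each partition, so on large tables this approach may become expensive.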