JJ Liu 7 mysql sql select join
week cookie
1 a
1 b
1 c
1 d
2 a
2 b
3 a
3 c
3 d
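This table records visits to a website by week. Each cookie represents one person, and each row means that person visited the site in that week. For example, the last row means 'd' came to the site in week 3.
I want to know how many of the same people keep coming back in each following week, given a starting week.
For example, starting from week 1, I would get a result like this: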
1 | 4
2 | 2
3 | 1
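Because 4 users came in week 1. Only 2 of them (a, b) came back in week 2. Only 1 (a) came in all 3 weeks.
How can I write a select query for this? The table will be big: there might be 100 weeks, so I want to find the right way to do it.
This query uses variables to track adjacent weeks and work out whether they are consecutive: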
set @start_week = 2, @week := 0, @conseq := 0, @cookie := '';
select conseq_weeks, count(*)
from (
    select
        cookie,
        -- length of the current run of consecutive weeks for this cookie
        if (cookie != @cookie or week != @week + 1, @conseq := 0, @conseq := @conseq + 1) + 1 as conseq_weeks,
        -- true when the row starts a run at @start_week or directly follows the previous row
        (cookie != @cookie and week <= @start_week) or (cookie = @cookie and week = @week + 1) as conseq,
        -- remember this row for the comparison on the next one
        @cookie := cookie as lastcookie,
        @week := week as lastweek
    from (select week, cookie from webhist where week >= @start_week order by 2, 1) x
) y
where conseq
group by 1;
This is for week 2. For another week, change the @start_week variable at the top.
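For comparison, here is a rough sketch of the same count without user variables. Two things prompted it: the MySQL manual says the evaluation order of expressions that assign user variables is not guaranteed (and assigning them outside SET is deprecated in MySQL 8.0), and as far as I can tell the query above would also count a cookie seen in weeks 1, 3 and 4 under conseq_weeks = 2, since the step from week 3 to week 4 looks consecutive. The sketch uses a correlated subquery instead. It runs in Python against an in-memory SQLite database purely so it can be tried without a server; the names RETENTION_SQL, make_sample_db and retention are made up for illustration, and I have not timed it on a big table.

```python
import sqlite3

# Assumption: a correlated-subquery rewrite, not the query from the answer.
# A row (week, cookie) counts only if the cookie was seen in every week
# from :start up to that week.
RETENTION_SQL = """
select w.week - :start + 1 as conseq_weeks,
       count(distinct w.cookie) as cookies
from webhist w
where w.week >= :start
  and (select count(distinct p.week)
       from webhist p
       where p.cookie = w.cookie
         and p.week between :start and w.week) = w.week - :start + 1
group by conseq_weeks
order by conseq_weeks
"""


def make_sample_db():
    """Build an in-memory SQLite copy of the sample table from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table webhist(week int, cookie char)")
    conn.executemany(
        "insert into webhist values (?, ?)",
        [(1, "a"), (1, "b"), (1, "c"), (1, "d"),
         (2, "a"), (2, "b"),
         (3, "a"), (3, "c"), (3, "d")],
    )
    return conn


def retention(conn, start_week):
    """Return (conseq_weeks, cookies) rows for the given starting week."""
    return conn.execute(RETENTION_SQL, {"start": start_week}).fetchall()


conn = make_sample_db()
print(retention(conn, 1))  # [(1, 4), (2, 2), (3, 1)]
print(retention(conn, 2))  # [(1, 2), (2, 1)]
```

The SQL text itself sticks to plain select / group by, so it should carry over to MySQL with the :start placeholder swapped for a value or a variable, though I have only run it on SQLite. An index on (cookie, week) would probably matter for the subquery on a large table.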
Here's the test:
create table webhist(week int, cookie char);
insert into webhist values (1, 'a'), (1, 'b'), (1, 'c'), (1, 'd'), (2, 'a'), (2, 'b'), (3, 'a'), (3, 'c'), (3, 'd');
Output of the user-variable query with @start_week = 1 (where week >= 1):
+--------------+----------+
| conseq_weeks | count(*) |
+--------------+----------+
| 1 | 4 |
| 2 | 2 |
| 3 | 1 |
+--------------+----------+
Output of the user-variable query with @start_week = 2 (where week >= 2):
+--------------+----------+
| conseq_weeks | count(*) |
+--------------+----------+
| 1 | 2 |
| 2 | 1 |
+--------------+----------+
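To double-check the two result tables by hand, the definition from the question can be sketched in a few lines of plain Python: for each cookie, walk forward from the starting week and stop at the first missing week. The function name retention_counts and the ROWS list are made up for this check; it is a reference for small test data, not something to run on the real table.

```python
from collections import Counter, defaultdict

# Sample rows from the question as (week, cookie) pairs.
ROWS = [
    (1, "a"), (1, "b"), (1, "c"), (1, "d"),
    (2, "a"), (2, "b"),
    (3, "a"), (3, "c"), (3, "d"),
]


def retention_counts(rows, start_week):
    """Count, per run length, the cookies seen in every week from start_week on."""
    weeks_by_cookie = defaultdict(set)
    for week, cookie in rows:
        weeks_by_cookie[cookie].add(week)

    counts = Counter()
    for weeks in weeks_by_cookie.values():
        run = 0
        # Extend the run one week at a time; a gap ends it for good.
        while start_week + run in weeks:
            run += 1
            counts[run] += 1
    return sorted(counts.items())


print(retention_counts(ROWS, 1))  # [(1, 4), (2, 2), (3, 1)]
print(retention_counts(ROWS, 2))  # [(1, 2), (2, 1)]
```

Both results agree with the tables above. Adding a cookie with a gap (say weeks 1, 3 and 4) to ROWS is a handy extra test case, because it should only ever show up in the first row when starting from week 1.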
P.S. Good question, but a bit of a ball-breaker.