Advanced SQL select query

JJ *_*Liu 7 mysql sql select join

week      cookie
1         a
1         b
1         c
1         d
2         a 
2         b
3         a
3         c
3         d

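This table records visits to a website. Each cookie represents one person, and each row means that person visited the site in the given week. For example, the last row means that 'd' came to the site in week 3.

Given a starting week, I want to know how many of the same people kept coming back in each following week.

For example, if I look at week 1, I would get a result like this: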

1 | 4
2 | 2
3 | 1

That is because 4 users came in week 1, only 2 of them (a, b) came back in week 2, and only 1 (a) came in all 3 weeks.
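To pin down the expected result before looking at SQL, here is a minimal Python sketch of the same rule: start with everyone seen in the starting week, then keep only those who also appear in each following week. The function name `retention` and the in-memory `rows` list are illustrative assumptions, not part of the original question.

```python
from collections import defaultdict

# Sample rows from the table above: (week, cookie)
rows = [(1, "a"), (1, "b"), (1, "c"), (1, "d"),
        (2, "a"), (2, "b"),
        (3, "a"), (3, "c"), (3, "d")]


def retention(rows, start_week):
    """Return [(week, count)] of cookies seen in every week from start_week on."""
    # Group the cookies by the week in which they visited.
    visitors = defaultdict(set)
    for week, cookie in rows:
        visitors[week].add(cookie)

    result = []
    week = start_week
    survivors = set(visitors.get(week, set()))
    # Stop as soon as nobody from the starting week is left.
    while survivors:
        result.append((week, len(survivors)))
        week += 1
        # Keep only the cookies that also came back in the next week.
        survivors &= visitors.get(week, set())
    return result


print(retention(rows, 1))  # [(1, 4), (2, 2), (3, 1)]
```

Note that 'c' and 'd' are not counted in week 3: they visited that week, but they skipped week 2, so their run from week 1 was already broken.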

How do I write a select query for this? The table will be big: there may be 100 weeks, so I want to find the right way to do it.

Boh*_*ian 3

This query uses variables to track adjacent weeks and work out whether they are consecutive:

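-- @start_week is the week to start from; the other variables hold the previous row's state
set @start_week = 2, @week := 0, @conseq := 0, @cookie:='';
select conseq_weeks, count(*)
from (
select
  cookie,
  -- streak length so far: reset on a new cookie or a gap in weeks, otherwise extend by one
  if (cookie != @cookie or week != @week + 1, @conseq := 0, @conseq := @conseq + 1) + 1 as conseq_weeks,
  -- true if the row starts a cookie's streak at the start week or continues it into the next week
  (cookie != @cookie and week <= @start_week) or (cookie = @cookie and week = @week + 1) as conseq,
  -- remember this row's cookie and week for the next row
  @cookie := cookie as lastcookie,
  @week := week as lastweek
-- the inner query orders rows by cookie, then week
from (select week, cookie from webhist where week >= @start_week order by 2, 1) x
) y
where conseq
group by 1;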

This is set up for week 2. For a different week, change the @start_week variable at the top.

Here is a test:

create table webhist(week int, cookie char);
insert into webhist values (1, 'a'), (1, 'b'), (1, 'c'), (1, 'd'), (2, 'a'), (2, 'b'), (3, 'a'), (3, 'c'), (3, 'd');

Output of the above query where week >= 1:

+--------------+----------+
| conseq_weeks | count(*) |
+--------------+----------+
|            1 |        4 |
|            2 |        2 |
|            3 |        1 |
+--------------+----------+

Output of the above query where week >= 2:

+--------------+----------+
| conseq_weeks | count(*) |
+--------------+----------+
|            1 |        2 |
|            2 |        1 |
+--------------+----------+
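As a cross-check on the two result tables, here is a rough sketch of a different technique: a correlated subquery instead of user variables, run against SQLite through Python's `sqlite3` module rather than MySQL. A row counts only if its cookie was seen in every week from the start week up to that row's week. The names `QUERY`, `conseq_counts` and the `visitors` column alias are illustrative assumptions, and how this performs on a large table has not been measured.

```python
import sqlite3

# Alternative to the user-variable query: a row qualifies when the number of
# distinct weeks its cookie was seen in (from the start week up to the row's
# week) equals the length of that span, i.e. there are no gaps.
QUERY = """
SELECT w.week - :start + 1      AS conseq_weeks,
       COUNT(DISTINCT w.cookie) AS visitors
FROM webhist AS w
WHERE w.week >= :start
  AND (SELECT COUNT(DISTINCT p.week)
       FROM webhist AS p
       WHERE p.cookie = w.cookie
         AND p.week BETWEEN :start AND w.week) = w.week - :start + 1
GROUP BY w.week
ORDER BY w.week
"""

# Same test data as the insert statement above.
SAMPLE = [(1, "a"), (1, "b"), (1, "c"), (1, "d"),
          (2, "a"), (2, "b"),
          (3, "a"), (3, "c"), (3, "d")]


def conseq_counts(start_week):
    """Load the sample data into an in-memory database and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE webhist (week INT, cookie CHAR)")
    conn.executemany("INSERT INTO webhist VALUES (?, ?)", SAMPLE)
    rows = conn.execute(QUERY, {"start": start_week}).fetchall()
    conn.close()
    return rows


print(conseq_counts(1))  # [(1, 4), (2, 2), (3, 1)]
print(conseq_counts(2))  # [(1, 2), (2, 1)]
```

Both calls reproduce the tables shown above for the sample data.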

P.S. Good question, but a bit of a ball-breaker.