har*_*old 12 sql postgresql date count
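I have a set of email addresses and the dates on which those addresses were added to a table. The same email address can appear multiple times, on different dates. For example, given the data set below, I would like to get each date together with the count of distinct emails seen between that date and 3 days earlier.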
Date | email
-------+----------------
1/1/12 | test@test.com
1/1/12 | test1@test.com
1/1/12 | test2@test.com
1/2/12 | test1@test.com
1/2/12 | test2@test.com
1/3/12 | test@test.com
1/4/12 | test@test.com
1/5/12 | test@test.com
1/5/12 | test@test.com
1/6/12 | test@test.com
1/6/12 | test@test.com
1/6/12 | test1@test.com
If we use a period of 3 days, the result set would look something like this:
date | count(distinct email)
-------+------
1/1/12 | 3
1/2/12 | 3
1/3/12 | 3
1/4/12 | 3
1/5/12 | 2
1/6/12 | 2
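I can get a distinct count for a fixed date range with the query below, but I want the count computed over a rolling range for each day, so that I do not have to update the range manually for hundreds of dates.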
select test.date, count(distinct test.email)
from test_table as test
where test.date between '2012-01-01' and '2012-05-08'
group by test.date;
Any help is appreciated.
Erw*_*ter 12
Test case:
CREATE TEMP TABLE tbl (day date, email text);
INSERT INTO tbl VALUES
('2012-01-01', 'test@test.com')
,('2012-01-01', 'test1@test.com')
,('2012-01-01', 'test2@test.com')
,('2012-01-02', 'test1@test.com')
,('2012-01-02', 'test2@test.com')
,('2012-01-03', 'test@test.com')
,('2012-01-04', 'test@test.com')
,('2012-01-05', 'test@test.com')
,('2012-01-05', 'test@test.com')
,('2012-01-06', 'test@test.com')
,('2012-01-06', 'test@test.com')
,('2012-01-06', 'test1@test.com');
Query - returns only days for which an entry exists in tbl:
SELECT day
,(SELECT count(DISTINCT email)
FROM tbl
WHERE day BETWEEN t.day - 2 AND t.day -- period of 3 days
) AS dist_emails
FROM tbl t
WHERE day BETWEEN '2012-01-01' AND '2012-01-06'
GROUP BY 1
ORDER BY 1;
Or - returns all days in the specified range, even if there are no rows for a given day:
SELECT day
,(SELECT count(DISTINCT email)
FROM tbl
WHERE day BETWEEN g.day - 2 AND g.day
) AS dist_emails
FROM (SELECT generate_series('2012-01-01'::date
, '2012-01-06'::date, '1d')::date) AS g(day)
Result:
day | dist_emails
-----------+------------
2012-01-01 | 3
2012-01-02 | 3
2012-01-03 | 3
2012-01-04 | 3
2012-01-05 | 1
2012-01-06 | 2
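This sounds like a job for window functions, but I did not find a way to define a suitable window frame. Also, per the documentation:
Aggregate window functions, unlike normal aggregate functions, do not allow
DISTINCT or ORDER BY to be used within the function argument list.
So I solved it with a correlated subquery instead. I guess that is the smartest way.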
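To illustrate the restriction quoted above, here is a minimal sketch against the tbl test table. The failing statement is left commented out so the block runs as a whole. The second statement relies on a RANGE frame with an interval offset, which is an assumption about your server version (it needs PostgreSQL 11 or later). It shows that a 3-day frame can be expressed there, but it counts rows, not distinct emails, so it does not replace the correlated subquery.

```sql
-- Assumes the tbl(day date, email text) test table from the test case.

-- Rejected: DISTINCT inside an aggregate that is used as a window function.
-- SELECT day, count(DISTINCT email) OVER (ORDER BY day) FROM tbl;
-- ERROR:  DISTINCT is not implemented for window functions

-- Accepted on PostgreSQL 11+ (assumption), but counts every row in the
-- 3-day frame, duplicates included, so it is NOT a distinct count.
SELECT DISTINCT
       day
     , count(email) OVER (ORDER BY day
                          RANGE BETWEEN interval '2 days' PRECEDING
                                    AND CURRENT ROW) AS emails_in_window
FROM   tbl
ORDER  BY day;
```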
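I renamed your date column to day, because using a type name as an identifier is bad practice.
By the way, "between said date and 3 days ago" would be a period of 4 days. Your definition is contradictory there.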
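If the 4-day reading is what you actually want, only the offset changes. The following is a sketch of that variant, not a new method: the same correlated subquery with 3 subtracted instead of 2, again assuming the tbl test table.

```sql
-- Assumes the tbl(day date, email text) test table from the test case.
-- "Said date and the 3 days before it" covers 4 calendar days,
-- so the lower bound is t.day - 3 rather than t.day - 2.
SELECT day
     ,(SELECT count(DISTINCT email)
       FROM   tbl
       WHERE  day BETWEEN t.day - 3 AND t.day  -- period of 4 days
      ) AS dist_emails
FROM   tbl t
WHERE  day BETWEEN '2012-01-01' AND '2012-01-06'
GROUP  BY 1
ORDER  BY 1;
```

On the test data this should give 3 for 2012-01-01 through 2012-01-05 and 2 for 2012-01-06, which differs from both the 3-day result and the expected output in the question.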
A bit shorter, but slower when the range covers only a few days:
SELECT day, count(DISTINCT email) AS dist_emails
FROM (SELECT generate_series('2012-01-01'::date
, '2012-01-06'::date, '1d')::date) AS g(day)
LEFT JOIN tbl t ON t.day BETWEEN g.day - 2 AND g.day
GROUP BY 1
ORDER BY 1;
In SQL Server:
select test.date, count(distinct test.email)
from test_table as test
where convert(date, test.date) between '2012-01-01' and '2012-05-08'
group by test.date
Hope this helps.