SCO*_*SCO 2 postgresql group-by count
I'm using PostgreSQL 9, and I'm struggling with counting and grouping when there are no rows to count.
Let's assume the following schema:
create table views (
    date_event timestamp with time zone,
    event_id integer
);
Suppose the table contains the following rows:
2012-01-01 00:00:05 2
2012-01-01 01:00:05 5
2012-01-01 03:00:05 8
2012-01-01 03:00:15 20
I want to group by hour and count the rows. I would like to get back the following:
2012-01-01 00:00:00 1
2012-01-01 01:00:00 1
2012-01-01 02:00:00 0
2012-01-01 03:00:00 2
2012-01-01 04:00:00 0
2012-01-01 05:00:00 0
.
.
2012-01-07 23:00:00 0
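That is, for each time slot I count the rows in my table whose date falls within that slot, and if there are none, I still return a row with a count of zero.
The following obviously does not work (it only returns rows whose count is greater than 0).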
SELECT extract ( hour from date_event ),count(*)
FROM views
where date_event > '2012-01-01' and date_event <'2012-01-07'
GROUP BY extract ( hour from date_event );
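Note that I may also need to group by minute, hour, day, month, or year (running multiple queries is fine, of course).
I can only use plain old SQL, and since my views table can be very large (> 100M records), I try to keep performance in mind.
How can this be achieved?
Thanks!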
Given that you don't have the dates in the table, you need a way to generate them. You can use the generate_series function:
SELECT * FROM generate_series('2012-01-01'::timestamp, '2012-01-07 23:00', '1 hour') AS ts;
This produces a result like this:
ts
---------------------
2012-01-01 00:00:00
2012-01-01 01:00:00
2012-01-01 02:00:00
2012-01-01 03:00:00
...
2012-01-07 21:00:00
2012-01-07 22:00:00
2012-01-07 23:00:00
(168 rows)
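Since the question also asks about other granularities, here is a small sketch of the same call with a different step. Only the third argument (and the end point) changes; the date ranges below are illustrative and not taken from the question.

```sql
-- Daily slots for the same week (7 rows)
SELECT ts FROM generate_series('2012-01-01'::timestamp, '2012-01-07', '1 day') AS ts;

-- Monthly slots for the year 2012 (12 rows)
SELECT ts FROM generate_series('2012-01-01'::timestamp, '2012-12-01', '1 month') AS ts;
```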
The remaining task is to join the two selects using an outer join, like this:
-- Table and column names adapted to the question's schema: views(date_event, event_id)
select extract(day from ts) as day,
       extract(hour from ts) as hour,
       coalesce(count, 0) as count
from (
    select extract(day from date_event) as day,
           extract(hour from date_event) as hr,
           count(*)
    from views
    where date_event > '2012-01-01' and date_event < '2012-01-07'
    group by extract(day from date_event), extract(hour from date_event)
) as cnt
right outer join (
    select * from generate_series('2012-01-01'::timestamp, '2012-01-07 23:00', '1 hour') as ts
) as dtetable
    on extract(hour from ts) = cnt.hr and extract(day from ts) = cnt.day
order by day, hour asc;
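As a possible variant (a sketch, not the answerer's query), the generated slots can be joined directly against the table on a time range, which avoids matching on separate day and hour parts and so also works for ranges that span more than one month. It assumes the question's views(date_event, event_id) schema. The alias slots and the output column slot_start are names introduced here for illustration.

```sql
-- Sketch: one row per hourly slot, with 0 where no events fall in the slot.
-- Assumes the question's table: views(date_event timestamptz, event_id integer).
SELECT slots.ts AS slot_start,
       count(v.date_event) AS count   -- counts matched rows only, so empty slots give 0
FROM generate_series('2012-01-01'::timestamptz,
                     '2012-01-07 23:00'::timestamptz,
                     '1 hour') AS slots(ts)
LEFT JOIN views AS v
       ON v.date_event >= slots.ts
      AND v.date_event <  slots.ts + interval '1 hour'
GROUP BY slots.ts
ORDER BY slots.ts;
```

To group by minute, day, or month instead, the step passed to generate_series and the interval added in the join condition would both change to the same unit. On a table with more than 100M rows, an index on date_event would probably matter for this join, though that has not been measured here.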