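As the title suggests, I'd like to select the first row of each set of rows grouped with a GROUP BY.
Specifically, if I have a purchases table that looks like this: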
SELECT * FROM purchases;
My output:
id | customer | total
---+----------+------
 1 | Joe      | 5
 2 | Sally    | 3
 3 | Joe      | 2
 4 | Sally    | 1
I'd like to query for the id of the largest purchase (total) made by each customer. Something like this:
SELECT FIRST(id), customer, FIRST(total)
FROM purchases
GROUP BY customer
ORDER BY total DESC;
Expected output:
FIRST(id) | customer | FIRST(total)
----------+----------+-------------
1 | Joe | 5
2 | Sally | 3
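In PostgreSQL this kind of query is often written with DISTINCT ON, which keeps the first row of each group according to the ORDER BY. A minimal sketch against the purchases table above; the trailing id in the ORDER BY is an assumption about which row should win when two purchases by the same customer have the same total:

```sql
-- Keep one row per customer: the one with the highest total.
-- The leading ORDER BY expression must match the DISTINCT ON expression.
SELECT DISTINCT ON (customer)
       id, customer, total
FROM   purchases
ORDER  BY customer, total DESC, id;  -- id is an assumed tie-breaker
```

Note that the result comes back ordered by customer, not by total; wrapping the query in a subquery and sorting the outer query would be one way to order by total instead.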
For example, I'd like to select the id with the max date for each category; the result is: 7, 2, 6
id category date
1 a 2013-01-01
2 b 2013-01-03
3 c 2013-01-02
4 a 2013-01-02
5 b 2013-01-02
6 c 2013-01-03
7 a 2013-01-03
8 b 2013-01-01
9 c 2013-01-01
May I know how to do this in PostgreSQL?
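One possible approach is a window function that numbers the rows of each category from newest to oldest. The question does not name the table, so the name items below is hypothetical, and the sample rows are inlined with VALUES so the sketch stands on its own:

```sql
-- "items" is a hypothetical name; the rows mirror the sample data above.
WITH items (id, category, date) AS (
    VALUES (1, 'a', DATE '2013-01-01'),
           (2, 'b', '2013-01-03'),
           (3, 'c', '2013-01-02'),
           (4, 'a', '2013-01-02'),
           (5, 'b', '2013-01-02'),
           (6, 'c', '2013-01-03'),
           (7, 'a', '2013-01-03'),
           (8, 'b', '2013-01-01'),
           (9, 'c', '2013-01-01')
), ranked AS (
    -- rn = 1 marks the newest row within each category
    SELECT id, category, date,
           row_number() OVER (PARTITION BY category ORDER BY date DESC) AS rn
    FROM   items
)
SELECT id, category, date
FROM   ranked
WHERE  rn = 1
ORDER  BY category;
```

If two rows in a category shared the same max date, row_number() would pick one of them arbitrarily unless a tie-breaker is added to the window's ORDER BY.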
I have the following table (simplified form) in Postgres 9.2:
CREATE TABLE log (
log_date DATE,
user_id INTEGER,
payload INTEGER
);
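It contains at most one record per user and per day. There will be approximately 500,000 records per day for 300 days. Each user's running_total (payload in the simplified table above) is always increasing.
I want to efficiently retrieve the latest record for each user before a specific date. My query is: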
SELECT user_id, max(log_date), max(payload)
FROM log
WHERE log_date <= :mydate
GROUP BY user_id
This is extremely slow. I have also tried:
SELECT DISTINCT ON (user_id) user_id, log_date, payload
FROM log
WHERE log_date <= :mydate
ORDER BY user_id, log_date DESC;
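which has the same plan and is equally slow.
So far I have a single index on user_msg_log(aggr_date), but it doesn't help much. What other index should I use to speed this up, or is there any other way to achieve what I want?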
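One index that might be worth trying is a multicolumn index whose column order matches the ORDER BY of the DISTINCT ON query. A sketch against the simplified log table; the index name is hypothetical, the date literal stands in for :mydate, and whether the planner actually prefers this index over a sequential scan depends on the data distribution:

```sql
-- Hypothetical index name; column order mirrors ORDER BY user_id, log_date DESC.
CREATE INDEX log_user_id_log_date_idx ON log (user_id, log_date DESC);

SELECT DISTINCT ON (user_id)
       user_id, log_date, payload
FROM   log
WHERE  log_date <= DATE '2014-01-01'  -- placeholder for :mydate
ORDER  BY user_id, log_date DESC;
```

Comparing EXPLAIN ANALYZE output before and after creating the index would show whether it is used at all.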
sql postgresql indexing greatest-n-per-group postgresql-performance
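When ordering a query in descending or ascending order, when would you ever want NULLS first?
In my opinion, the desired behavior the vast majority of the time, whether sorting ascending or descending, is NULLS LAST. NULLS FIRST should instead be the option we have to specify explicitly.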
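For context, PostgreSQL sorts NULLs as if they were larger than any non-null value, so the default is NULLS LAST for ASC and NULLS FIRST for DESC. A small self-contained sketch of overriding the default on a descending sort (the inline values are made up for illustration):

```sql
-- Without NULLS LAST, the NULL would be returned first under DESC.
SELECT x
FROM   (VALUES (1), (NULL), (3)) AS t (x)
ORDER  BY x DESC NULLS LAST;
```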
I have 3.5 million rows in the table acs_objects, and I need to retrieve the distinct values of the creation_date column formatted as a year.
My first attempt: 180~200 sec (15 rows fetched)
SELECT DISTINCT to_char(creation_date,'YYYY') FROM acs_objects
My second attempt: 35~40 sec (15 rows fetched)
SELECT DISTINCT to_char(creation_date,'YYYY')
FROM (SELECT DISTINCT creation_date FROM acs_objects) AS distinct_date
Is there any way to make it faster? - "I need to use this in an ADP website"
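Since only about 15 distinct years come back, one idea is to generate the candidate years and probe the table once per year instead of scanning all rows. This is only a sketch under two assumptions that the question does not confirm: creation_date is a timestamp column, and there is a plain B-tree index on acs_objects (creation_date), so that min(), max() and each EXISTS probe can be answered from the index:

```sql
-- Assumes creation_date is a timestamp and is indexed (not stated in the question).
SELECT to_char(y, 'YYYY') AS year
FROM   generate_series(
           (SELECT date_trunc('year', min(creation_date)) FROM acs_objects),
           (SELECT max(creation_date) FROM acs_objects),
           interval '1 year'
       ) AS y
WHERE  EXISTS (
           -- one cheap range probe per candidate year
           SELECT 1
           FROM   acs_objects
           WHERE  creation_date >= y
           AND    creation_date <  y + interval '1 year'
       );
```

Rows with a NULL creation_date are ignored by this version, whereas the original DISTINCT queries would return a NULL entry for them.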
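I was troubleshooting a few slow SQL queries today and don't quite understand the performance difference below:
When trying to extract the max(timestamp) from a data table based on some condition, using MAX() is slower than ORDER BY timestamp LIMIT 1 if a matching row exists, but considerably faster if no matching row is found.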
SELECT timestamp
FROM data JOIN sensors ON ( sensors.id = data.sensor_id )
WHERE sensors.station_id = 4
ORDER BY timestamp DESC
LIMIT 1;
(0 rows)
Time: 1314.544 ms
SELECT timestamp
FROM data JOIN sensors ON ( sensors.id = data.sensor_id )
WHERE sensors.station_id = 5
ORDER BY timestamp DESC
LIMIT 1;
(1 row)
Time: 10.890 ms
SELECT MAX(timestamp)
FROM data JOIN sensors ON ( sensors.id = data.sensor_id )
WHERE sensors.station_id …
I have a table in a PostgreSQL 9.2 DB, created and filled as follows:
CREATE TABLE foo( id integer, date date );
INSERT INTO foo
SELECT (id % 10) + 1, now() - (id % 50) * interval '1 day'
FROM generate_series(1, 100000) AS id;
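Now, I need to find all pairs (id, date) such that the date is the maximum among all pairs with the same id. This query is well known, and it is commonly written with the window function ROW_NUMBER():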
SELECT id, date
FROM (
SELECT id, date, ROW_NUMBER() OVER (PARTITION BY id ORDER BY date DESC) rn
FROM foo
) sbt
WHERE sbt.rn = 1;
Now, I asked for the plan of that query and found that the WindowAgg node requires the table to be sorted first.
Subquery Scan on sbt (cost=11116.32..14366.32 rows=500 width=8) (actual time=71.650..127.809 rows=10 …
sql postgresql indexing greatest-n-per-group window-functions
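Because only the two columns (id, date) are needed here, a plain aggregate can return the same pairs without numbering every row. This is a sketch of an alternative to the ROW_NUMBER() query, not a statement about which plan PostgreSQL will choose; it assumes date is never NULL, since max() skips NULLs while ORDER BY date DESC would rank them first:

```sql
-- One row per id with its latest date; equivalent to rn = 1 when date is NOT NULL.
SELECT id, max(date) AS date
FROM   foo
GROUP  BY id;
```

With the sample data (10 distinct ids) the planner can typically answer this with a hash aggregate over a single pass, which avoids the full sort that the WindowAgg node needs.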