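As the title suggests, I'd like to select the first row of each set of rows grouped with a GROUP BY.
Specifically, if I have a purchases table that looks like this: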
SELECT * FROM purchases;
My output:
 id | customer | total
----+----------+-------
  1 | Joe      |     5
  2 | Sally    |     3
  3 | Joe      |     2
  4 | Sally    |     1
I'd like to query for the id of the largest purchase (total) made by each customer. Something like this:
SELECT FIRST(id), customer, FIRST(total)
FROM purchases
GROUP BY customer
ORDER BY total DESC;
Expected output:
 FIRST(id) | customer | FIRST(total)
-----------+----------+--------------
         1 | Joe      |            5
         2 | Sally    |            3
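One way to express the pseudocode above in PostgreSQL is `DISTINCT ON`, a PostgreSQL-specific extension. The following is a minimal sketch, not a definitive answer: the column types are guessed from the sample output, and the tie-breaking rule (lowest `id` wins when two purchases have the same total) is an assumption the question does not specify.

```sql
-- Assumed schema: column types are guessed from the sample output.
CREATE TABLE purchases (
    id       integer PRIMARY KEY,
    customer text    NOT NULL,
    total    integer NOT NULL
);

INSERT INTO purchases (id, customer, total) VALUES
    (1, 'Joe',   5),
    (2, 'Sally', 3),
    (3, 'Joe',   2),
    (4, 'Sally', 1);

-- DISTINCT ON keeps the first row of each customer group, where "first"
-- is defined by the ORDER BY: highest total, then lowest id on ties.
SELECT DISTINCT ON (customer)
       id, customer, total
FROM   purchases
ORDER  BY customer, total DESC, id;
```

With the sample data this returns purchase 1 for Joe and purchase 2 for Sally. The leading `ORDER BY` expressions have to match the `DISTINCT ON` expressions, which is why `customer` comes first in the sort.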
I have the following table (simplified form) in Postgres 9.2:
CREATE TABLE log (
log_date DATE,
user_id INTEGER,
payload INTEGER
);
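It contains at most one record per user per day. There will be approximately 500,000 records per day for 300 days. The running_total is always increasing for each user.
I want to efficiently retrieve the latest record for each user before a specific date. My query is: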
SELECT user_id, max(log_date), max(payload)
FROM log
WHERE log_date <= :mydate
GROUP BY user_id
This is very slow. I have also tried:
SELECT DISTINCT ON (user_id) user_id, log_date, payload
FROM log
WHERE log_date <= :mydate
ORDER BY user_id, log_date DESC;
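That has the same plan and is equally slow.
So far I only have an index on user_msg_log(aggr_date), but it doesn't help much. What other index should I use to speed this up, or is there any other way to achieve what I want?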
sql postgresql indexing greatest-n-per-group postgresql-performance
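A multicolumn index whose column order matches the `ORDER BY` of the `DISTINCT ON` query is one thing worth trying for the question above. This is only a sketch under assumptions: the index name `log_user_date_idx` and the literal date standing in for `:mydate` are hypothetical, and whether the planner actually uses the index depends on the data distribution, so the plan should be checked with `EXPLAIN ANALYZE`.

```sql
-- Table definition repeated from the question so the sketch is self-contained.
CREATE TABLE log (
    log_date DATE,
    user_id  INTEGER,
    payload  INTEGER
);

-- Hypothetical index name; the column order mirrors the ORDER BY below.
CREATE INDEX log_user_date_idx ON log (user_id, log_date DESC);

-- Latest row per user on or before a given date.
-- The date literal is a placeholder for the :mydate parameter.
SELECT DISTINCT ON (user_id)
       user_id, log_date, payload
FROM   log
WHERE  log_date <= DATE '2014-01-01'
ORDER  BY user_id, log_date DESC;
```

Unlike the `max(log_date), max(payload)` version, this returns `log_date` and `payload` from the same row, so it does not rely on the payload increasing over time.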
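There are two tables, conversations and messages, and I want to fetch each conversation along with the content of its latest message.
conversations - id (primary key), name, created_at
messages - id, content, created_at, conversation_id
Currently we are running this query to get the required data: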
SELECT
conversations.id,
m.content AS last_message_content,
m.created_at AS last_message_at
FROM
conversations
INNER JOIN messages m ON conversations.id = m.conversation_id
AND m.id = (
SELECT
id
FROM
messages _m
WHERE
m.conversation_id = _m.conversation_id
ORDER BY
created_at DESC
LIMIT 1)
ORDER BY
last_message_at DESC
LIMIT 15
OFFSET 0
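The above query returns valid data, but its performance degrades as the number of rows grows. Is there another way to write this query that performs better? A fiddle is attached as an example.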
http://sqlfiddle.com/#!17/2decb/2
I also tried the suggestion from one of the deleted answers:
SELECT DISTINCT ON (c.id)
c.id,
m.content AS last_message_content,
m.created_at AS last_message_at
FROM conversations AS c
INNER JOIN messages AS m …
sql postgresql greatest-n-per-group postgresql-performance postgresql-13
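For the conversations question, a `LATERAL` subquery backed by a matching index is one commonly used alternative to the correlated `m.id = (...)` join. The sketch below assumes column types (the question only lists column names) and uses a hypothetical index name, `messages_conv_created_idx`; it has not been measured against the asker's data.

```sql
-- Assumed column types; the question only gives the column names.
CREATE TABLE conversations (
    id         bigint PRIMARY KEY,
    name       text,
    created_at timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE messages (
    id              bigint PRIMARY KEY,
    content         text,
    created_at      timestamptz NOT NULL DEFAULT now(),
    conversation_id bigint NOT NULL REFERENCES conversations (id)
);

-- Hypothetical index: lets the subquery below find the newest message
-- of one conversation without scanning all of its messages.
CREATE INDEX messages_conv_created_idx
    ON messages (conversation_id, created_at DESC);

-- For each conversation, pick its single most recent message.
SELECT c.id,
       m.content    AS last_message_content,
       m.created_at AS last_message_at
FROM   conversations AS c
CROSS  JOIN LATERAL (
    SELECT content, created_at
    FROM   messages
    WHERE  conversation_id = c.id
    ORDER  BY created_at DESC
    LIMIT  1
) AS m
ORDER  BY m.created_at DESC
LIMIT  15;
```

Like the original `INNER JOIN`, this drops conversations that have no messages. Note that the outer `ORDER BY ... LIMIT 15` still has to look at the latest message of every conversation before it can pick the top 15.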
I have 3.5 million rows in the table acs_objects, and I need to retrieve the distinct values of the creation_date column formatted as a year.
My first attempt: 180~200 Sec (15 Rows Fetched)
SELECT DISTINCT to_char(creation_date,'YYYY') FROM acs_objects
My second attempt: 35~40 Sec (15 Rows Fetched)
SELECT DISTINCT to_char(creation_date,'YYYY')
FROM (SELECT DISTINCT creation_date FROM acs_objects) AS distinct_date
Is there any way to make it faster? - "I need to use this on an ADP website"
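Since only about 15 distinct years come out of 3.5 million rows, one option is to emulate a loose index scan with a recursive CTE that jumps from one year to the next using an index on `creation_date`. This is a sketch under assumptions: the minimal table definition, the `timestamptz` column type, and the index name are hypothetical stand-ins for the real `acs_objects` schema, and no timing is claimed.

```sql
-- Assumed minimal schema; the real acs_objects table has more columns.
CREATE TABLE acs_objects (
    object_id     integer PRIMARY KEY,
    creation_date timestamptz
);

-- Hypothetical index name; lets each min() lookup below be an index probe.
CREATE INDEX acs_objects_creation_date_idx
    ON acs_objects (creation_date);

WITH RECURSIVE years AS (
    -- Start at the beginning of the earliest year present.
    SELECT date_trunc('year', min(creation_date)) AS y
    FROM   acs_objects
    UNION ALL
    -- Jump to the first year that has a row after the current year.
    -- The subquery yields NULL when no later row exists, which ends the recursion.
    SELECT (SELECT date_trunc('year', min(creation_date))
            FROM   acs_objects
            WHERE  creation_date >= years.y + interval '1 year')
    FROM   years
    WHERE  years.y IS NOT NULL
)
SELECT to_char(y, 'YYYY') AS creation_year
FROM   years
WHERE  y IS NOT NULL;
```

Each recursion step performs one small lookup instead of reading every row, so the total work grows with the number of distinct years rather than with the table size.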
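I have a table with hundreds of millions of rows, and I want to get a single list of unique values from 2 indexed columns of the same table (there is no unique row ID).
To illustrate, suppose we have a table with a fruits column and a veggies column, and I want to build a healthy_foods list containing the unique values from both columns.
I have tried the following queries:
With UNION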
WITH cte as (
SELECT fruit, veggie
FROM recipes
)
SELECT fruit as healthy_food
FROM cte
UNION -- <---
SELECT veggie as healthy_food
FROM cte;
With UNION ALL, then DISTINCT ON
WITH cte as (...)
SELECT DISTINCT ON (healthy_food) healthy_food FROM -- <---
(SELECT fruit as healthy_food
FROM cte
UNION ALL -- <---
SELECT veggie as healthy_food
FROM cte) tb;
With UNION …