Tom*_*ter 3 sql group-by max amazon-redshift
I'm working on Redshift and have a table:
userid oid version number_of_objects
1 ab 1 10
1 ab 2 20
1 ab 3 17
1 ab 4 16
1 ab 5 14
1 cd 1 5
1 cd 2 6
1 cd 3 9
1 cd 4 12
2 ef 1 4
2 ef 2 3
2 gh 1 16
2 gh 2 12
2 gh 3 21
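From this table I want to select the maximum version for each oid, together with the userid and number_of_objects of that row.
When I tried this, I unfortunately got the whole table back: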
SELECT MAX(version), oid, userid, number_of_objects
FROM table
GROUP BY oid, userid, number_of_objects
LIMIT 10;
But the result I'm actually looking for is:
userid oid MAX(version) number_of_objects
1 ab 5 14
1 cd 4 12
2 ef 2 3
2 gh 3 21
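DISTINCT ON doesn't work either; Redshift says:
SELECT DISTINCT ON is not supported
Do you have any ideas?
Update: In the meantime I came up with the workaround below. I don't think it's the smartest solution, and it's also very slow, but at least it works. Just in case: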
SELECT * FROM table,
(SELECT MAX(version) as maxversion, oid, userid
FROM table
GROUP BY oid, userid
) as maxtable
WHERE table.oid = maxtable.oid
AND table.userid = maxtable.userid
AND table.version = maxtable.maxversion
LIMIT 100;
Do you have a better solution?

If Redshift has window functions, you can try this:
SELECT *
FROM (
select oid,
userid,
version,
number_of_objects,
max(version) over (partition by oid, userid) as max_version
from the_table
) t
where version = max_version;
I would expect this to be faster than a self join with a group by.
Another option is to use the row_number() function:
SELECT *
FROM (
select oid,
userid,
version,
number_of_objects,
row_number() over (partition by oid, userid order by version desc) as rn
from the_table
) t
where rn = 1;
Which of the two you use is mostly a matter of personal taste. Performance-wise, I wouldn't expect any difference.
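If you want to try both variants without a cluster at hand, here is a rough sketch that loads the sample rows from the question into an in-memory SQLite database and runs each query. SQLite is only a stand-in for Redshift here (an assumption, and it needs SQLite 3.25 or newer for window functions). The helper name latest_versions and the column types are made up for the illustration, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Sample rows from the question: (userid, oid, version, number_of_objects)
ROWS = [
    (1, "ab", 1, 10), (1, "ab", 2, 20), (1, "ab", 3, 17), (1, "ab", 4, 16),
    (1, "ab", 5, 14), (1, "cd", 1, 5), (1, "cd", 2, 6), (1, "cd", 3, 9),
    (1, "cd", 4, 12), (2, "ef", 1, 4), (2, "ef", 2, 3), (2, "gh", 1, 16),
    (2, "gh", 2, 12), (2, "gh", 3, 21),
]

# Variant 1: compare each row's version with the max of its partition.
MAX_OVER = """
SELECT userid, oid, version, number_of_objects
FROM (
    SELECT userid, oid, version, number_of_objects,
           max(version) OVER (PARTITION BY oid, userid) AS max_version
    FROM the_table
) t
WHERE version = max_version
ORDER BY userid, oid
"""

# Variant 2: number the rows per partition, newest first, and keep row 1.
ROW_NUMBER = """
SELECT userid, oid, version, number_of_objects
FROM (
    SELECT userid, oid, version, number_of_objects,
           row_number() OVER (PARTITION BY oid, userid
                              ORDER BY version DESC) AS rn
    FROM the_table
) t
WHERE rn = 1
ORDER BY userid, oid
"""


def latest_versions(query):
    """Hypothetical helper: build the sample table and run one query on it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE the_table ("
        "userid INTEGER, oid TEXT, version INTEGER, number_of_objects INTEGER)"
    )
    conn.executemany("INSERT INTO the_table VALUES (?, ?, ?, ?)", ROWS)
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for name, query in (("max() over", MAX_OVER), ("row_number()", ROW_NUMBER)):
        print(name, latest_versions(query))
```

On this data both queries give the four rows the question asks for. One difference to keep in mind: if the same (oid, userid) pair could ever have its highest version twice, the max() variant would return both rows, while row_number() always keeps exactly one.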