use*_*906 5 sql postgresql aggregate-functions duplicates window-functions
I have a table author_data:
author_id | author_name
----------+----------------
9 | ernest jordan
14 | k moribe
15 | ernest jordan
25 | william h nailon
79 | howard jason
36 | k moribe
Now I need the result as follows:
author_id | author_name
----------+----------------
9 | ernest jordan
15 | ernest jordan
14 | k moribe
36 | k moribe
That is, I need the author_id of every name that appears more than once. I tried this statement:
select author_id,count(author_name)
from author_data
group by author_name
having count(author_name)>1
But it does not work. How can I get this result?
I suggest a window function in a subquery:
SELECT author_id, author_name  -- omit the name here, if you just need ids
FROM  (
   SELECT author_id, author_name
        , count(*) OVER (PARTITION BY author_name) AS ct
   FROM   author_data
   ) sub
WHERE  ct > 1;
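You will recognize the basic aggregate function count(). It can be turned into a window function by appending an OVER clause, just like any other aggregate function.
This way, it counts the rows per partition, that is, per author_name. Voilà.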
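To see what the window function contributes, the inner query can be run on its own. The following is only a sketch: it recreates the sample rows from the question in a temporary table (the column types are an assumption, since the question does not state them) and adds an ORDER BY purely to make the output easy to read.

```sql
-- Sample data from the question, recreated in a temporary table.
-- Column types are assumed; the question does not show the table definition.
-- Note: a temp table of this name shadows a real author_data for the session.
CREATE TEMP TABLE author_data (
   author_id   int PRIMARY KEY
 , author_name text NOT NULL
);

INSERT INTO author_data (author_id, author_name) VALUES
   (9 , 'ernest jordan')
 , (14, 'k moribe')
 , (15, 'ernest jordan')
 , (25, 'william h nailon')
 , (79, 'howard jason')
 , (36, 'k moribe');

-- Inner query alone: every row is kept and gains the row count of its partition.
SELECT author_id, author_name
     , count(*) OVER (PARTITION BY author_name) AS ct
FROM   author_data
ORDER  BY author_name, author_id;
```

With the sample data, both 'ernest jordan' rows and both 'k moribe' rows get ct = 2, while the other two rows get ct = 1. The outer WHERE ct > 1 then keeps only the duplicated names. The subquery is needed because a window function cannot be referenced directly in the WHERE clause of the same query level.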
In older versions without window functions (PostgreSQL 8.3 or older), or generally, this alternative performs pretty fast:
SELECT author_id, author_name  -- omit name, if you just need ids
FROM   author_data a
WHERE  EXISTS (
   SELECT 1
   FROM   author_data a2
   WHERE  a2.author_name = a.author_name
   AND    a2.author_id <> a.author_id
   );
If you are concerned about performance, add an index on author_name.
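A minimal sketch of such an index, assuming a plain B-tree index is sufficient; the index name is made up for illustration and any name will do:

```sql
-- Hypothetical index name; a default B-tree index on the column being compared.
CREATE INDEX author_data_author_name_idx ON author_data (author_name);
```

With this in place, the planner can look up matching names for the EXISTS subquery through the index instead of scanning the whole table for each row. Whether it actually does so depends on table size and statistics, so check with EXPLAIN ANALYZE on real data.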