Ben*_*nni 2 sql amazon-redshift
I want to select the X most common pairs per group in a table. Let's consider the following table:
+-------------+-----------+
| identifier  | city      |
+-------------+-----------+
| AB          | Seattle   |
| AC          | Seattle   |
| AC          | Seattle   |
| AB          | Seattle   |
| AD          | Seattle   |
| AB          | Chicago   |
| AB          | Chicago   |
| AD          | Chicago   |
| AD          | Chicago   |
| BC          | Chicago   |
+-------------+-----------+
If I want to select the 2 most common identifiers for each city, the result should be:
+-------------+-----------+
| identifier  | city      |
+-------------+-----------+
| AB          | Seattle   |
| AC          | Seattle   |
| AB          | Chicago   |
| AD          | Chicago   |
+-------------+-----------+
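Any help is appreciated. Thanks, Benni
You can use count(*) inside row_number() to rank each city/identifier combination by its number of occurrences, then select the top two per city.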
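select city, identifier
from (
    -- count rows per (city, identifier), then rank the pairs within each city;
    -- ties on the count are broken alphabetically by identifier
    select city, identifier,
           row_number() over (partition by city
                              order by count(*) desc, identifier) as rnum_cnt
    from tbl
    group by city, identifier
) t
where rnum_cnt <= 2  -- keep the two most frequent identifiers per city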
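To see why those rows come out on top, it can help to look at the intermediate counts and row numbers before filtering. The sketch below inlines the sample rows from the question as a CTE named tbl (the same table name the answer uses; the union all construction and the cnt alias are assumptions added here for illustration) and prints one row per city/identifier pair.

```sql
-- Sample data from the question, inlined so the query is self-contained.
with tbl as (
    select 'AB' as identifier, 'Seattle' as city
    union all select 'AC', 'Seattle'
    union all select 'AC', 'Seattle'
    union all select 'AB', 'Seattle'
    union all select 'AD', 'Seattle'
    union all select 'AB', 'Chicago'
    union all select 'AB', 'Chicago'
    union all select 'AD', 'Chicago'
    union all select 'AD', 'Chicago'
    union all select 'BC', 'Chicago'
)
select city,
       identifier,
       count(*) as cnt,  -- occurrences of this pair
       row_number() over (partition by city
                          order by count(*) desc, identifier) as rnum_cnt
from tbl
group by city, identifier
order by city, rnum_cnt;

-- city    | identifier | cnt | rnum_cnt
-- Chicago | AB         |   2 |        1
-- Chicago | AD         |   2 |        2
-- Chicago | BC         |   1 |        3
-- Seattle | AB         |   2 |        1
-- Seattle | AC         |   2 |        2
-- Seattle | AD         |   1 |        3
```

Because row_number() always assigns distinct numbers, exactly two identifiers per city survive the rnum_cnt <= 2 filter, with ties on the count decided by the identifier in the order by. If tied identifiers should all be returned instead, one option is to swap row_number() for rank() and drop the identifier tie-breaker, at the cost of possibly getting more than two rows per city.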