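I want to find the most frequently occurring value in each group.
I tried using top(k)(column) but got the following error: Column value is not under aggregate function and not in GROUP BY.
For example, suppose I have a table test_date with columns (pid, value):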
pid, value
----------
1,a
1,b
1,a
1,c
I want this result:
pid, value
----------
1,a
I tried SELECT pid, top(1)(value) top_value FROM test_data GROUP BY pid
I get the error:
Column value is not under aggregate function and not in GROUP BY
I also tried anyHeavy(), but it only works for a value that occurs in more than half of the cases.
This query should help:
SELECT
    pid,
    /*
    Decompose the query in parts:
    1. groupArray((value, count)): convert the group of rows with the same 'pid' to the array of tuples (value, count)
    2. arrayReverseSort: sort in descending order by 'count' ('x.2' is 'count')
    3. [1].1: take the 'value' from the first item of the sorted array
    */
    arrayReverseSort(x -> x.2, groupArray((value, count)))[1].1 AS value
FROM
(
    SELECT
        pid,
        value,
        count() AS count
    FROM test_date
    GROUP BY
        pid,
        value
)
GROUP BY pid
ORDER BY pid ASC
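
To check what the query should return on the sample rows, its three steps can be traced outside the database. The following is only a rough Python sketch of the same logic, not ClickHouse code. The function name most_frequent_per_group is made up for illustration. Ties here go to the value seen first, which is an assumption of this sketch and may differ from how the query breaks ties.

```python
from collections import Counter, defaultdict

# Sample rows from the question, as (pid, value) pairs.
rows = [(1, "a"), (1, "b"), (1, "a"), (1, "c")]


def most_frequent_per_group(rows):
    """Return [(pid, most frequent value)] ordered by pid (hypothetical helper)."""
    # Inner query: count() ... GROUP BY pid, value
    counts = Counter(rows)

    # Outer query, step 1: groupArray((value, count)) ... GROUP BY pid
    grouped = defaultdict(list)
    for (pid, value), count in counts.items():
        grouped[pid].append((value, count))

    result = []
    for pid in sorted(grouped):  # ORDER BY pid ASC
        # Step 2: arrayReverseSort(x -> x.2, ...) sorts by count, descending.
        ranked = sorted(grouped[pid], key=lambda x: x[1], reverse=True)
        # Step 3: [1].1 takes the value of the first tuple.
        result.append((pid, ranked[0][0]))
    return result


print(most_frequent_per_group(rows))  # [(1, 'a')]
```

For the sample table this gives pid 1 with value a, matching the result asked for in the question.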