jan*_*cki 5 greatest-n-per-group clickhouse
What is the correct way to query the top N rows per group in ClickHouse?
Take as an example a table tbl with columns id2, id4, v3, and N=2. I tried the following:
SELECT
id2,
id4,
v3 AS v3
FROM tbl
GROUP BY
id2,
id4
ORDER BY v3 DESC
LIMIT 2 BY
id2,
id4
but it fails with this error:
Received exception from server (version 19.3.4):
Code: 215. DB::Exception: Received from localhost:9000, 127.0.0.1. DB::Exception
: Column v3 is not under aggregate function and not in GROUP BY..
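I could add v3 to the GROUP BY, and that does seem to work, but grouping by a metric is inefficient.
There is the any aggregate function, but we actually want all values (limited to 2 by the LIMIT BY clause) rather than any single value, so that does not sound like the right solution.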
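As a point of comparison for the reasoning above: the error comes from mixing a non-aggregated v3 with GROUP BY, while LIMIT BY itself does not need a GROUP BY. A minimal sketch, assuming a hypothetical Memory table with made-up sample rows (only the column names come from the question):

```sql
-- Hypothetical sample data; only the column names id2, id4, v3 come from the question.
CREATE TABLE tbl (id2 UInt32, id4 UInt32, v3 Float64) ENGINE = Memory;
INSERT INTO tbl VALUES (1, 1, 10), (1, 1, 30), (1, 1, 20), (2, 1, 5);

-- No GROUP BY: rows are sorted by v3 descending, then LIMIT 2 BY keeps
-- at most the first 2 rows for each distinct (id2, id4) combination.
SELECT
    id2,
    id4,
    v3
FROM tbl
ORDER BY v3 DESC
LIMIT 2 BY
    id2,
    id4;
-- With the sample rows above this keeps (1, 1, 30), (1, 1, 20) and (2, 1, 5).
```

This is not equivalent to grouping by v3: without GROUP BY, duplicate (id2, id4, v3) rows are kept as separate rows, whereas adding v3 to GROUP BY collapses them into one.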
It can be done with aggregate functions, like this:
SELECT
id2,
id4,
arrayJoin(arraySlice(arrayReverseSort(groupArray(v3)), 1, 2)) v3
FROM tbl
GROUP BY
id2,
id4
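To see what each function in that expression contributes, the pipeline can be run step by step on a literal array. This is only an illustrative sketch: the array [3, 1, 5, 2] is a made-up stand-in for what groupArray(v3) would collect for one (id2, id4) group.

```sql
-- [3, 1, 5, 2] is a hypothetical stand-in for groupArray(v3) of a single group.
SELECT
    arrayReverseSort([3, 1, 5, 2]) AS sorted_desc,                 -- [5, 3, 2, 1]
    arraySlice(arrayReverseSort([3, 1, 5, 2]), 1, 2) AS top_two;   -- [5, 3]; the offset is 1-based

-- arrayJoin then unfolds the sliced array into one row per element (5 and 3).
SELECT arrayJoin(arraySlice(arrayReverseSort([3, 1, 5, 2]), 1, 2)) AS v3;
```

One design consideration: groupArray(v3) collects every value of a group into an array before sorting and slicing, so memory use grows with group size.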