I have a table: t
My goal: extract only the "id" with the highest score in the table, grouped by week_number.
Query:
SELECT id,
CAST(EXTRACT(WEEK FROM TIMESTAMP_SECONDS(created_utc)) AS STRING) AS week_number,
MAX(score) AS highest_score
FROM t
WHERE body='r/twinpeaks'
GROUP BY id;
I get this error: Error: SELECT list expression references column created_utc which is neither grouped nor aggregated at [2:49]
I tried this instead:
SELECT id,
CAST(EXTRACT(WEEK FROM TIMESTAMP_SECONDS(created_utc)) AS STRING) AS week_number,
MAX(score) AS highest_score
FROM t
WHERE body='r/twinpeaks'
GROUP BY week_number, id;
But this is what I get:
Row id week_number highest_score
1 dmkb6sv 36 1
2 dn1cd2s 37 2
3 dn43h1k 38 16
4 dn3xf18 38 1
5 dn7i1ko 38 1
6 dnpr9b1 39 1
This is what I want:
Row id week_number highest_score
1 dmkb6sv 36 1
2 dn1cd2s 37 2
3 dn43h1k 38 16
6 dnpr9b1 39 1
Below is for BigQuery Standard SQL
#standardSQL
SELECT
EXTRACT(WEEK FROM TIMESTAMP_SECONDS(created_utc)) AS week_number,
ARRAY_AGG(id ORDER BY score DESC LIMIT 1)[OFFSET(0)] id,
ARRAY_AGG(score ORDER BY score DESC LIMIT 1)[OFFSET(0)] highest_score
FROM `project.dataset.table`
WHERE body = 'r/twinpeaks'
GROUP BY week_number
ORDER BY week_number
You can try using ROW_NUMBER here:
SELECT *
FROM
(
SELECT id,
CAST(EXTRACT(WEEK FROM TIMESTAMP_SECONDS(created_utc)) AS STRING) AS week_number,
score,
ROW_NUMBER() OVER (PARTITION BY
CAST(EXTRACT(WEEK FROM TIMESTAMP_SECONDS(created_utc)) AS STRING)
ORDER BY score DESC) rn
FROM t
WHERE body = 'r/twinpeaks'
) t
WHERE rn = 1;
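This returns the record with the highest score in each week number. I'm assuming here that you either don't care about ties in the first place, or that ties won't occur. If you need to handle ties, you can use a ranking function instead of ROW_NUMBER.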
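As a rough sketch of that tie-handling variant, the query below swaps ROW_NUMBER for RANK, assuming the same table t and columns as in the question. The alias rnk is just an illustrative name. With RANK, every row that shares the top score within a week gets the value 1, so a week can come back with more than one row:

```sql
#standardSQL
SELECT id, week_number, score AS highest_score
FROM (
  SELECT
    id,
    EXTRACT(WEEK FROM TIMESTAMP_SECONDS(created_utc)) AS week_number,
    score,
    -- RANK() assigns 1 to every row tied for the highest score in its week
    RANK() OVER (
      PARTITION BY EXTRACT(WEEK FROM TIMESTAMP_SECONDS(created_utc))
      ORDER BY score DESC
    ) AS rnk
  FROM t
  WHERE body = 'r/twinpeaks'
)
WHERE rnk = 1
ORDER BY week_number;
```

If exactly one row per week is still required, adding a second sort key to the ORDER BY inside the window (for example id) and keeping ROW_NUMBER would make the choice deterministic.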