ROW_NUMBER() fails when the table is too large

Ale*_*lex 2 sql row-number google-bigquery

I am using BigQuery, and I need ROW_NUMBER() to get only the first row per group that meets certain criteria.

Example:

select *except(rn)
from (
SELECT
  *,
  ROW_NUMBER() OVER (PARTITION BY id order by timedate desc) AS rn
FROM
 table
)
where rn = 1

However, the query fails because the table is too large. How can I apply this kind of logic without running out of resources?

Mik*_*ant 5

Below is for BigQuery Standard SQL:

#standardSQL
SELECT AS VALUE ARRAY_AGG(t ORDER BY timedate DESC LIMIT 1)[OFFSET(0)]
FROM `project.dataset.table` t
GROUP BY id

You can test and play with the above using dummy data, as below:

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 id, 2 timedate, 3 z UNION ALL
  SELECT 1,4,5 UNION ALL
  SELECT 1,6,7 UNION ALL
  SELECT 2,8,9 UNION ALL
  SELECT 2, 10, 11
)
SELECT AS VALUE ARRAY_AGG(t ORDER BY timedate DESC LIMIT 1)[OFFSET(0)]
FROM `project.dataset.table` t
GROUP BY id

The result is:

Row id  timedate    z    
1   1   6           7    
2   2   10          11   
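
For anyone wondering what the `ARRAY_AGG` expression is doing, here is a rough breakdown using the same dummy rows. As far as I understand it, `ROW_NUMBER() OVER (PARTITION BY id ...)` has to number every row of every partition before the outer `WHERE rn = 1` throws most of them away, while `ARRAY_AGG(... ORDER BY ... LIMIT 1)` is a plain aggregate that only needs to keep the top row per `id`. Treat that as the general idea rather than a statement about BigQuery internals. The column aliases `latest_as_array` and `latest_as_struct` are made up for illustration.

```sql
#standardSQL
-- Same hypothetical dummy rows as above, standing in for the real table.
WITH `project.dataset.table` AS (
  SELECT 1 id, 2 timedate, 3 z UNION ALL
  SELECT 1, 4, 5 UNION ALL
  SELECT 1, 6, 7 UNION ALL
  SELECT 2, 8, 9 UNION ALL
  SELECT 2, 10, 11
)
SELECT
  id,
  -- Step 1: per id, a one-element array holding the whole row (as a STRUCT)
  -- that has the greatest timedate.
  ARRAY_AGG(t ORDER BY timedate DESC LIMIT 1) AS latest_as_array,
  -- Step 2: [OFFSET(0)] pulls that single STRUCT out of the array.
  ARRAY_AGG(t ORDER BY timedate DESC LIMIT 1)[OFFSET(0)] AS latest_as_struct
FROM `project.dataset.table` t
GROUP BY id
```

In the original query, `SELECT AS VALUE` then returns that STRUCT as the row itself, so the output has the table's own columns (`id`, `timedate`, `z`) instead of one nested column.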