Select one row per unique ID with the latest timestamp

Mic*_*iak 3 google-bigquery

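I have a table in BigQuery containing a unique ID, a timestamp, and a distance, and I want to select one record per ID: the one with the latest timestamp.

For example, the table looks like this: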

ID|timestamp|distance
A|100|2
A|90|3
B|110|5
D|100|4
A|80|2
B|10|2

The query should return something like this:

A|100|2
B|110|5
D|100|4

A working query in PostgreSQL looks like this, but there is no "DISTINCT ON" in BigQuery?

SELECT * FROM (
SELECT DISTINCT ON (ID)
id, timestamp, distance
FROM ranking
ORDER BY ID, timestamp DESC
) AS latest_dtg
ORDER BY distance

Mik*_*ant 5

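The query below is for BigQuery Standard SQL. It groups the rows by id, aggregates each group into an array ordered by timestamp descending, keeps only the first element, and expands that struct back into columns: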

#standardSQL
SELECT row.* FROM (
  SELECT ARRAY_AGG(r ORDER BY timestamp DESC LIMIT 1)[OFFSET(0)] AS row
  FROM ranking AS r
  GROUP BY id
)

You can test it with the dummy data from your question:

#standardSQL
WITH ranking AS (
  SELECT 'A' AS id, 100 AS timestamp, 2 AS distance UNION ALL
  SELECT 'A', 90, 3 UNION ALL
  SELECT 'B', 110, 5 UNION ALL
  SELECT 'D', 100, 4 UNION ALL
  SELECT 'B', 10, 2 UNION ALL
  SELECT 'A', 80, 2
)
SELECT row.* FROM (
  SELECT ARRAY_AGG(r ORDER BY timestamp DESC LIMIT 1)[OFFSET(0)] AS row
  FROM ranking AS r
  GROUP BY id
)
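
As an alternative sketch (not part of the answer above), the same "latest row per id" result can usually be obtained with a window function: number the rows within each id by timestamp descending and keep row number 1. The column alias `rn` is a name chosen here for illustration, and the final `ORDER BY distance` is added only to mirror the outer ordering of the PostgreSQL query in the question. If two rows of the same id share the latest timestamp, which of them is kept is not deterministic.

```sql
#standardSQL
WITH ranking AS (
  -- Dummy data from the question
  SELECT 'A' AS id, 100 AS timestamp, 2 AS distance UNION ALL
  SELECT 'A', 90, 3 UNION ALL
  SELECT 'B', 110, 5 UNION ALL
  SELECT 'D', 100, 4 UNION ALL
  SELECT 'A', 80, 2 UNION ALL
  SELECT 'B', 10, 2
)
SELECT * EXCEPT(rn)  -- drop the helper column (rn is an illustrative name)
FROM (
  SELECT
    r.*,
    -- 1 = most recent row within each id
    ROW_NUMBER() OVER (PARTITION BY id ORDER BY timestamp DESC) AS rn
  FROM ranking AS r
)
WHERE rn = 1
ORDER BY distance
```

With the dummy data this should return the rows A|100|2, D|100|4 and B|110|5, in that order.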