Merging rows while ignoring null values in BigQuery

Ben*_*eid 1 sql google-bigquery

I have a Google BigQuery table that looks like this:

| id | col_1      | col_2   | updated |
|----|------------|---------|---------|
|  1 | first_data | null    | 4/22    |
|  1 | null       | old     | 4/23    |
|  1 | null       | correct | 4/24    |

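I want to construct a query that combines these rows and "overwrites" a null column whenever another row with the same id has a non-null value in that column. Essentially, the result should look like this: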

|  1 | first_data | correct | 4/24    |

If possible, I would also like the result to represent the history:

|  1 | first_data | old     | 4/23    |
|  1 | first_data | correct | 4/24    |

But that is secondary and not required.

Mik*_*ant 5

Below is for BigQuery Standard SQL

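#standardSQL
-- Fill each NULL with the most recent earlier non-NULL value for the same id.
SELECT id,
  IFNULL(col_1, FIRST_VALUE(col_1 IGNORE NULLS) OVER(win)) col_1,
  IFNULL(col_2, FIRST_VALUE(col_2 IGNORE NULLS) OVER(win)) col_2,
  updated
FROM `project.dataset.your_table`
-- Rows are sorted newest first, so "1 FOLLOWING .. UNBOUNDED FOLLOWING"
-- covers every row older than the current one.
WINDOW win AS (PARTITION BY id ORDER BY updated DESC
               ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING)
-- ORDER BY id, updated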
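
For comparison, here is a sketch of what should be an equivalent formulation (not part of the original answer). It sorts ascending and lets the frame include the current row, so `LAST_VALUE(... IGNORE NULLS)` returns the current value when it is non-null and otherwise the latest earlier non-null value, without the `IFNULL` wrapper. The inline sample rows are the three rows from the question, and the sketch assumes, like the query above, that `updated` sorts chronologically.

```sql
#standardSQL
WITH `project.dataset.your_table` AS (
  -- Sample rows copied from the question.
  SELECT 1 id, 'first_data' col_1, CAST(NULL AS STRING) col_2, '4/22' updated UNION ALL
  SELECT 1,     NULL,              'old',                      '4/23'         UNION ALL
  SELECT 1,     NULL,              'correct',                  '4/24'
)
SELECT id,
  -- Latest non-NULL value among the current row and all older rows.
  LAST_VALUE(col_1 IGNORE NULLS) OVER(win) col_1,
  LAST_VALUE(col_2 IGNORE NULLS) OVER(win) col_2,
  updated
FROM `project.dataset.your_table`
WINDOW win AS (PARTITION BY id ORDER BY updated
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
ORDER BY id, updated
```

Which form to use is mostly a matter of taste; both rely on the sort order of `updated` to decide which value counts as the most recent.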

You can test and play with the above using dummy data, as in the example below

#standardSQL
WITH `project.dataset.your_table` AS (
  SELECT 1 id, 'first_data' col_1, NULL col_2,  '4/22' updated UNION ALL
  SELECT 1,     NULL,             'old',        '4/23'         UNION ALL
  SELECT 1,     NULL,             'correct',    '4/24'         UNION ALL
  SELECT 1,    'next_data',       NULL,         '4/25'         UNION ALL
  SELECT 1,     NULL,             NULL,         '4/26'         
)
SELECT id, 
  IFNULL(col_1, FIRST_VALUE(col_1 IGNORE NULLS) OVER(win)) col_1, 
  IFNULL(col_2, FIRST_VALUE(col_2 IGNORE NULLS) OVER(win)) col_2, 
  updated
FROM `project.dataset.your_table`
WINDOW win AS (PARTITION BY id ORDER BY updated DESC 
               ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING)
ORDER BY id, updated

Result

Row id  col_1       col_2   updated  
1   1   first_data  null    4/22     
2   1   first_data  old     4/23     
3   1   first_data  correct 4/24     
4   1   next_data   correct 4/25     
5   1   next_data   correct 4/26     
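
The result above covers the history variant from the question. For the primary request, a single merged row per id, one possible sketch (again not from the original answer) aggregates instead of using a window: for each column, `ARRAY_AGG` with `IGNORE NULLS`, `ORDER BY updated DESC` and `LIMIT 1` keeps only the newest non-null value, and `SAFE_OFFSET(0)` yields NULL when a column has no non-null value at all. The sample rows are the three from the question, and the sketch assumes that `updated` sorts chronologically.

```sql
#standardSQL
WITH `project.dataset.your_table` AS (
  -- Sample rows copied from the question.
  SELECT 1 id, 'first_data' col_1, CAST(NULL AS STRING) col_2, '4/22' updated UNION ALL
  SELECT 1,     NULL,              'old',                      '4/23'         UNION ALL
  SELECT 1,     NULL,              'correct',                  '4/24'
)
SELECT id,
  -- Newest non-NULL value per column; NULL if the column is NULL in every row.
  ARRAY_AGG(col_1 IGNORE NULLS ORDER BY updated DESC LIMIT 1)[SAFE_OFFSET(0)] AS col_1,
  ARRAY_AGG(col_2 IGNORE NULLS ORDER BY updated DESC LIMIT 1)[SAFE_OFFSET(0)] AS col_2,
  MAX(updated) AS updated
FROM `project.dataset.your_table`
GROUP BY id
```

For the three sample rows this should return one row: id 1, first_data, correct, 4/24.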