BigQuery: can UNNEST return only the distinct values of an array?

use*_*835 3 google-bigquery

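I have a BigQuery table with a field that is an array of strings. For some records, the array can hold duplicate string values.

Is it possible to filter out the duplicates in the BigQuery UNNEST clause, so that UNNEST returns only the distinct string values of the array?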

Fel*_*ffa 9

There are many ways to do this. Since you didn't specify the desired input and output, I'll pick one arbitrarily.

Use ARRAY_AGG(DISTINCT):

WITH data AS (
  SELECT 1 id, ["a", "a", "b", "e", "a", "c", "b", "a"] strings
)


SELECT id, ARRAY_AGG(DISTINCT string) strings
FROM data, UNNEST(strings) string
GROUP BY id

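If one row per distinct value is wanted rather than a deduplicated array, a minimal sketch is to apply SELECT DISTINCT to the unnested rows directly. This assumes the same hypothetical `data` CTE as above; the alias `s` is illustrative only.

```sql
-- Sketch: one output row per distinct (id, value) pair.
-- `data` and the alias `s` are illustrative names, not from a real table.
WITH data AS (
  SELECT 1 AS id, ["a", "a", "b", "e", "a", "c", "b", "a"] AS strings
)
SELECT DISTINCT id, s
FROM data, UNNEST(strings) AS s
ORDER BY id, s
```

For the sample row this should give four rows for id 1: a, b, c, e.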


Mik*_*ant 5

The following is for BigQuery Standard SQL:

#standardSQL
SELECT * REPLACE(ARRAY(SELECT DISTINCT el FROM t.arr AS el) AS arr)
FROM `project.dataset.table` t    

You can test and play with the above using dummy data, as in the example below:

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 col1, 2 col2, ["a", "a", "b", "e", "a", "c", "b", "a"] arr, 3 col3 UNION ALL
  SELECT 4, 5, ["x", "y", "z"], 5
)
SELECT * REPLACE(ARRAY(SELECT DISTINCT el FROM t.arr AS el) AS arr)
FROM `project.dataset.table` t   

with output:

Row col1    col2    arr col3     
1   1       2       a   3    
                    b        
                    e        
                    c        
2   4       5       x   5    
                    y        
                    z        
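One caveat worth hedging: without an ORDER BY, neither ARRAY_AGG(DISTINCT ...) nor ARRAY(SELECT DISTINCT ...) is documented to preserve the original element order. If first-occurrence order matters, a possible sketch is to carry the element position with WITH OFFSET and order by the smallest position per value. The aliases `el` and `pos` are illustrative, and the dummy table reuses the names from the example above.

```sql
#standardSQL
-- Sketch: deduplicate while keeping first-occurrence order.
-- `pos` is the zero-based array index exposed by WITH OFFSET.
WITH `project.dataset.table` AS (
  SELECT 1 col1, 2 col2, ["a", "a", "b", "e", "a", "c", "b", "a"] arr, 3 col3 UNION ALL
  SELECT 4, 5, ["x", "y", "z"], 5
)
SELECT * REPLACE(
  ARRAY(
    SELECT el
    FROM t.arr AS el WITH OFFSET AS pos
    GROUP BY el
    ORDER BY MIN(pos)  -- earliest position of each distinct value
  ) AS arr)
FROM `project.dataset.table` t
```

For the first sample row the earliest positions are a=0, b=2, e=3, c=5, so the array should come out as a, b, e, c.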