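I have a BigQuery table with a field that is an array of strings. For some records, the array can hold duplicate string values.
Is it possible to filter out duplicates in the BigQuery UNNEST clause, so that UNNEST returns only the distinct string values of the array?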
There are many ways to do this. Since you didn't specify the desired input and output, I'll pick one arbitrarily.
Use ARRAY_AGG(DISTINCT):
WITH data AS (
SELECT 1 id, ["a", "a", "b", "e", "a", "c", "b", "a"] strings
)
SELECT id, ARRAY_AGG(DISTINCT string) strings
FROM data, UNNEST(strings) string
GROUP BY id
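If what you actually want is one row per distinct value rather than a deduplicated array, a minimal sketch is to apply SELECT DISTINCT to the unnested rows. It assumes the same sample data as above; the alias `s` is only illustrative.

```sql
WITH data AS (
  SELECT 1 id, ["a", "a", "b", "e", "a", "c", "b", "a"] strings
)
-- One output row per (id, distinct array element)
SELECT DISTINCT id, s
FROM data, UNNEST(strings) s
```

Row order is not guaranteed here; add an ORDER BY if you need one.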
Below is for BigQuery Standard SQL
#standardSQL
SELECT * REPLACE(ARRAY(SELECT DISTINCT el FROM t.arr AS el) AS arr)
FROM `project.dataset.table` t
You can test and play with the above using dummy data, as in the example below
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 col1, 2 col2, ["a", "a", "b", "e", "a", "c", "b", "a"] arr, 3 col3 UNION ALL
SELECT 4, 5, ["x", "y", "z"], 5
)
SELECT * REPLACE(ARRAY(SELECT DISTINCT el FROM t.arr AS el) AS arr)
FROM `project.dataset.table` t
with output
Row  col1  col2  arr  col3
1    1     2     a    3
                 b
                 e
                 c
2    4     5     x    5
                 y
                 z
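One caveat: as far as I know, SELECT DISTINCT without an ORDER BY does not guarantee the order of elements in the resulting array, even though the sample output happens to show them in first-occurrence order. If that order matters, a sketch along these lines may help. It uses WITH OFFSET to get each element's position; the alias `pos` and the trimmed-down dummy table are assumptions for illustration.

```sql
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 col1, ["a", "a", "b", "e", "a", "c", "b", "a"] arr
)
SELECT * REPLACE(
  ARRAY(
    SELECT el
    FROM t.arr AS el WITH OFFSET pos
    GROUP BY el            -- one row per distinct element
    ORDER BY MIN(pos)      -- keep the order of first occurrence
  ) AS arr)
FROM `project.dataset.table` t
```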