ece*_*ulm 5 snowflake-cloud-data-platform
I have a dataset in which one column contains an array of objects, like this:
ID TAGS
1 {"tags": [{"tag": "a"}, {"tag": "b"}]}
2 {"tags": [{"tag": "c"}, {"tag": "d"}]}
I want to extract the tag field from each element of the tags array, so that the final result is:
ID TAGS
1 ["a","b"]
2 ["c","d"]
Assume the following table t1:
CREATE OR REPLACE TEMPORARY TABLE t1 AS (
select 1 as ID , PARSE_JSON('{"tags": [{"tag":"a"}, {"tag":"b"}]}') AS PAYLOAD
UNION ALL
select 2, PARSE_JSON('{"tags": [{"tag":"c"}, {"tag":"d"}]}')
);
A pure SQL approach is to combine LATERAL FLATTEN with ARRAY_AGG, as follows:
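with t2 as (
    -- FLATTEN emits one row per element of payload:tags;
    -- the alias f avoids reusing the CTE name t2 for the flattened rows
    select ID, f.value:tag as tag
    from t1, LATERAL FLATTEN(input => payload:tags) f
)
-- collect the extracted tags back into one array per ID
select t2.id, ARRAY_AGG(t2.tag) as tags from t2
group by ID
order by ID ASC;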
On its own, t2 produces:
ID TAG
1 "a"
1 "b"
2 "c"
2 "d"
After the GROUP BY ID, this becomes:
ID TAGS
1 [ "a", "b" ]
2 [ "c", "d" ]
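The same flatten-then-aggregate idea can probably be written as a single query without the intermediate CTE. The version below is a sketch against the t1 table defined earlier, not a tested replacement. It assumes that the element order of the original array should be preserved, which ARRAY_AGG alone does not guarantee, so it sorts on the INDEX column that FLATTEN returns for array elements. The alias f is only an illustrative name.

```sql
-- Sketch: one-pass variant of the query above, assuming table t1(ID, PAYLOAD).
SELECT t1.id,
       -- f.index is the position of the element inside payload:tags;
       -- ordering on it keeps the tags in their original array order
       ARRAY_AGG(f.value:tag) WITHIN GROUP (ORDER BY f.index) AS tags
FROM t1,
     LATERAL FLATTEN(input => t1.payload:tags) f
GROUP BY t1.id
ORDER BY t1.id;
```

Note that rows whose tags array is empty or missing produce no flattened rows and therefore drop out of the result. If those IDs need to be kept, the OUTER => TRUE argument of FLATTEN is worth a look.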