How do I convert a stringified array into an array in BigQuery?

Bor*_*Hna 12 google-bigquery bigquery-standard-sql

I happen to have a stringified array in a field in BigQuery:

'["a","b","c"]'

I want to convert it into an array that BigQuery understands. I would like to be able to do this in Standard SQL:

with k as (select '["a","b","c"]' as x)
select x from k, unnest(x) x

I have already tried JSON_EXTRACT('["a","b","c"]','$') and everything else I could find online.

Any ideas?

Mik*_*ant 15

The following is for BigQuery Standard SQL:

#standardSQL
WITH k AS (
  SELECT 1 AS id, '["a","b","c"]' AS x UNION ALL
  SELECT 2, '["x","y"]' 
)
SELECT 
  id, 
  ARRAY(SELECT * FROM UNNEST(SPLIT(SUBSTR(x, 2 , LENGTH(x) - 2)))) AS x
FROM k

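It converts your string column into an array column. Note that each element keeps its surrounding double quotes (for example "a"), and that the approach assumes no element contains a comma.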
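The slicing in the query above can be traced outside BigQuery. The sketch below is not BigQuery code: it mimics SUBSTR and SPLIT with JavaScript string methods, and the function name splitStringifiedArray is hypothetical, chosen only for illustration. It shows that the brackets are removed but each element keeps its quotes.

```javascript
// Mimics SPLIT(SUBSTR(x, 2, LENGTH(x) - 2)) from the query above.
// SQL SUBSTR is 1-based, so position 2 corresponds to JavaScript index 1.
// splitStringifiedArray is a hypothetical helper name, not a BigQuery function.
function splitStringifiedArray(x) {
  // Drop the leading "[" and the trailing "]".
  const inner = x.slice(1, x.length - 1);
  // SPLIT uses "," as its default delimiter for STRING input.
  return inner.split(",");
}

const elements = splitStringifiedArray('["a","b","c"]');
// Each element is still wrapped in double quotes: "a", "b", "c"
console.log(elements);
```

Because the split is purely textual, an element that itself contains a comma would be broken into two pieces, so this is only suitable for simple values.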


Rya*_*uck 6

This solution updates @northtree's answer and handles array members more gracefully, returning them as stringified JSON objects rather than as the string [object Object]:

CREATE TEMP FUNCTION
  JSON_EXTRACT_ARRAY(input STRING)
  RETURNS ARRAY<STRING>
  LANGUAGE js AS """  
return JSON.parse(input).map(x => JSON.stringify(x));
""";

with

raw as (
  select
    1 as id,
    '[{"a": 5, "b": 6}, {"a": 7}, 456]' as body
)

select
  id,
  entry,
  json_extract(entry, '$'),
  json_extract(entry, '$.a'),
  json_extract(entry, '$.b')
from
  raw,
  unnest(json_extract_array(body)) as entry
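To see why JSON.stringify matters here, the UDF body can be run as plain JavaScript outside BigQuery. In the sketch below, jsonExtractArray wraps the UDF body from above, while naiveExtractArray is a hypothetical comparison variant (not taken from any answer) that converts members with String().

```javascript
// The UDF body from above, wrapped in a plain function so it runs outside BigQuery.
function jsonExtractArray(input) {
  return JSON.parse(input).map(x => JSON.stringify(x));
}

// Hypothetical naive variant for comparison only: String() on an object
// discards its contents and yields "[object Object]".
function naiveExtractArray(input) {
  return JSON.parse(input).map(x => String(x));
}

const body = '[{"a": 5, "b": 6}, {"a": 7}, 456]';

// Members come back as JSON text: '{"a":5,"b":6}', '{"a":7}', '456'
console.log(jsonExtractArray(body));

// Objects collapse to '[object Object]'; only the number survives as '456'
console.log(naiveExtractArray(body));
```

Since each member is returned as valid JSON text, it can be passed on to json_extract with paths such as '$.a', as the query above does.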


Alo*_*nme 5

Recently (2020), the JSON_EXTRACT_ARRAY function was added to BigQuery Standard SQL.

It gives the expected behavior easily, with no UDF or tricks:

with k as (select JSON_EXTRACT_ARRAY('["a","b","c"]', '$') as x)
select unnested_x from k, unnest(x) unnested_x

This results in:

┌──────────────┐
│ "unnested_x" │
├──────────────┤
│     "a"      │
│     "b"      │
│     "c"      │
└──────────────┘

JSON_EXTRACT_ARRAY documentation