Get the first N elements of an array in a BigQuery table

mat*_*252 5 sql arrays select google-bigquery

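I have an array column and I want to get its first N elements (keeping the array data type). Is there a good way to do this? Ideally without having to unnest, rank, and then array_agg back into an array.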

I could also do something like this (to get the first 2 elements):

WITH data AS
(
  SELECT 1001 as id, ['a', 'b', 'c'] as array_1
  UNION ALL
  SELECT 1002 as id, ['d', 'e', 'f', 'g'] as array_1
  UNION ALL
  SELECT 1003 as id, ['h', 'i'] as array_1
)
select *,
       [array_1[SAFE_OFFSET(0)], array_1[SAFE_OFFSET(1)]] as my_result
from data

But this is clearly not a good solution, because it fails whenever an array has only 1 element.

Ell*_*ard 5

Here is a general-purpose UDF solution that you can call with an array of any type:

CREATE TEMP FUNCTION TopN(arr ANY TYPE, n INT64) AS (
  ARRAY(SELECT x FROM UNNEST(arr) AS x WITH OFFSET off WHERE off < n ORDER BY off)
);

WITH data AS
(
  SELECT 1001 as id, ['a', 'b', 'c'] as array_1
  UNION ALL
  SELECT 1002 as id, ['d', 'e', 'f', 'g'] as array_1
  UNION ALL
  SELECT 1003 as id, ['h', 'i'] as array_1
)
select *, TopN(array_1, 2) AS my_result
from data

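It uses UNNEST and the ARRAY function, which it sounds like you wanted to avoid, but it has the advantage of being generic enough that you can pass any array to it.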
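
For reference, the same logic can also be written inline, without the temp function. This is only a minimal sketch: row 1004, with its single-element array, is a hypothetical addition that is not in the original data. It is there to exercise the short-array case raised in the question.

```sql
WITH data AS
(
  SELECT 1001 AS id, ['a', 'b', 'c'] AS array_1
  UNION ALL
  SELECT 1004 AS id, ['z'] AS array_1  -- hypothetical single-element row
)
SELECT
  *,
  -- Keep the elements whose zero-based offset is below 2, in original order.
  ARRAY(
    SELECT x
    FROM UNNEST(array_1) AS x WITH OFFSET off
    WHERE off < 2
    ORDER BY off
  ) AS my_result
FROM data
```

Because the filter only keeps offsets that actually exist, a one-element array should come back as a one-element array rather than being padded with NULL. The ORDER BY off clause preserves the original element order.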