How do I query an array in BigQuery?

BI *_*ect 5 google-bigquery

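Schema of the field in BigQuery: items, type: STRING

The values in the items field of the table are stored as a string:

{"data": [{"id": "1234", "plan": {"sub_id": "567", "metadata": {"currentlySelling": "true", "custom_attributes": "{\"shipping\": true,\"productLimit\":10}", "Features": "[\"10 products\", \"Online support\"]"}, "name": "Personal", "object": "plan"}, "quantity": 1}], "has_more": false}

Two questions:
1) How can I query inside the array, for example to find rows where shipping is true or where one of the features is "Online support"?
2) I have to store the data as a string because the "custom_attributes" value can change. Is there a better way to store data in BigQuery when the value of one of the nested keys can change?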

Ell*_*ard 7

Your query would look something like this:

#standardSQL
SELECT game
FROM YourTable
WHERE EXISTS (SELECT 1 FROM UNNEST(participant) WHERE name = 'sam');

This returns every game in which 'sam' was a participant. Here is a self-contained example:

#standardSQL
WITH YourTable AS (
  SELECT 'A' AS game, ARRAY<STRUCT<name STRING, age INT64>>[('sam', 12), ('tony', 12), ('julia', 12)] AS participant UNION ALL
  SELECT 'B', ARRAY<STRUCT<name STRING, age INT64>>[('sam', 12), ('max', 12), ('jacob', 12)] UNION ALL
  SELECT 'C', ARRAY<STRUCT<name STRING, age INT64>>[('sam', 12), ('max', 12), ('julia', 12)]
)
SELECT game
FROM YourTable
WHERE EXISTS (SELECT 1 FROM UNNEST(participant) WHERE name = 'sam');

If you want to pivot the data so that there is one column per participant, you can use a query like this:

#standardSQL
CREATE TEMP FUNCTION WasParticipant(
    p_name STRING, participant ARRAY<STRUCT<name STRING, age INT64>>) AS (
  EXISTS(SELECT 1 FROM UNNEST(participant) WHERE name = p_name)
);

WITH YourTable AS (
  SELECT 'A' AS game, ARRAY<STRUCT<name STRING, age INT64>>[('sam', 12), ('tony', 12), ('julia', 12)] AS participant UNION ALL
  SELECT 'B', ARRAY<STRUCT<name STRING, age INT64>>[('sam', 12), ('max', 12), ('jacob', 12)] UNION ALL
  SELECT 'C', ARRAY<STRUCT<name STRING, age INT64>>[('sam', 12), ('max', 12), ('julia', 12)]
)
SELECT
  ARRAY_AGG(IF(WasParticipant('sam', participant), game, NULL) IGNORE NULLS) AS sams_games,
  ARRAY_AGG(IF(WasParticipant('tony', participant), game, NULL) IGNORE NULLS) AS tonys_games,
  ARRAY_AGG(IF(WasParticipant('julia', participant), game, NULL) IGNORE NULLS) AS julias_games,
  ARRAY_AGG(IF(WasParticipant('max', participant), game, NULL) IGNORE NULLS) AS maxs_games
FROM YourTable;

This returns one array per participant, containing the games that participant played in.
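The same EXISTS ... UNNEST pattern can probably be applied to the JSON string from the question. The following is only a sketch under a few assumptions: the table name YourTable and the column name items are hypothetical, the sample row is the JSON value quoted in the question, and the string is parsed with BigQuery's JSON_QUERY_ARRAY, JSON_VALUE, and JSON_VALUE_ARRAY functions. Because "custom_attributes" and "Features" are themselves JSON-encoded strings, they are decoded a second time before being inspected.

```sql
#standardSQL
-- Hypothetical table and column names; the sample row is the JSON from the question.
-- A raw string literal (r'''...''') keeps the \" escapes exactly as stored.
WITH YourTable AS (
  SELECT r'''{"data": [{"id": "1234", "plan": {"sub_id": "567", "metadata": {"currentlySelling": "true", "custom_attributes": "{\"shipping\": true,\"productLimit\":10}", "Features": "[\"10 products\", \"Online support\"]"}, "name": "Personal", "object": "plan"}, "quantity": 1}], "has_more": false}''' AS items
)
SELECT items
FROM YourTable
WHERE EXISTS (
  SELECT 1
  -- Each element of $.data comes back as a JSON-formatted string.
  FROM UNNEST(JSON_QUERY_ARRAY(items, '$.data')) AS entry
  WHERE
    -- custom_attributes is a JSON-encoded string, so extract it as a scalar
    -- first and then read the shipping flag from the decoded text.
    JSON_VALUE(JSON_VALUE(entry, '$.plan.metadata.custom_attributes'), '$.shipping') = 'true'
    -- Features is also a JSON-encoded string holding an array of strings.
    OR 'Online support' IN UNNEST(
      JSON_VALUE_ARRAY(JSON_VALUE(entry, '$.plan.metadata.Features')))
);
```

With the sample row above, the filter is expected to match, since shipping is true and "Online support" appears among the features. Keeping the column as a string and decoding at query time is one way to tolerate a "custom_attributes" value whose keys change, at the cost of parsing the JSON on every query.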