unnest() not exploding array, returns error: Column alias list has 1 entries but 't' has 2 columns available

Dou*_*Fir 2 sql presto amazon-athena

I have some JSON data that contains an attribute "characters". It looks like this:

select json_data['characters'] from latest_snapshot_events

Returns: [{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":60,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":10,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":3},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":50,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":39,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":2},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":80,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":6801450488388220,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":1,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":4},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":85,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":8355588830097610,"shards":0,"CHAR_TPIECES":5,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":4}]

This is returned as a single row. I would like each item in the array to get its own row.

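I found some SO posts and other blogs suggesting that I use unnest(). I have tried this several times but cannot get a result back. For example, here is a query from the Presto documentation, whose last section covers unnest as a replacement for Hive's LATERAL VIEW explode: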

SELECT student, score
FROM tests
CROSS JOIN UNNEST(scores) AS t (score);

So I tried to apply this to my table:

characters as (
select
  jdata.characters
from latest_snapshot_events
cross join unnest(json_data) as t(jdata)
)
select * from characters;

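Here json_data is a field in latest_snapshot_events that contains a property "characters", which is an array like the one shown above.

This returns an error:

[Simba]AthenaJDBC An error has been thrown from the AWS Athena client. SYNTAX_ERROR: line 69:12: Column alias list has 1 entries but 't' has 2 columns available

How can I unnest/explode latest_snapshot_events.json_data['characters'] into multiple rows?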

Mar*_*rso 5

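Since characters is a JSON array in textual representation, you have to:

  1. Parse the JSON text with json_parse to produce a value of type JSON.
  2. Convert the JSON value into a SQL array using CAST.
  3. Explode the array using UNNEST.

For example: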

WITH data(characters) AS (
    VALUES '[{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":60,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":10,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":3},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":50,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":39,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":2},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":80,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":6801450488388220,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":1,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":4},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":85,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":8355588830097610,"shards":0,"CHAR_TPIECES":5,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":4}]'
)
SELECT entry
FROM data, UNNEST(CAST(json_parse(characters) AS array(json))) t(entry)

which produces:

                               entry
-----------------------------------------------------------------------
 {"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":60,"CHAR_A3_LVL":1,...
 {"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":50,"CHAR_A3_LVL":1,...
 {"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":80,"CHAR_A3_LVL":1,...
 {"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":85,"CHAR_A3_LVL":1,...
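The same three steps could be applied to the table from the question roughly as sketched here. This assumes json_data is a map(varchar, varchar) column, which the question does not state outright. It would be consistent with the error, though: UNNEST on a map produces two columns (key and value), so a one-name alias list such as t(jdata) does not match. The inline latest_snapshot_events definition and its shortened JSON are stand-ins so the query can run on its own.

```sql
-- Stand-in for the real table; json_data is ASSUMED to be map(varchar, varchar).
WITH latest_snapshot_events(json_data) AS (
    VALUES map(
        ARRAY['characters'],
        ARRAY['[{"ITEM":10,"ITEM_LEVEL":3},{"ITEM":39,"ITEM_LEVEL":2}]']
    )
)
SELECT entry
FROM latest_snapshot_events
-- Pick the 'characters' value out of the map first, then parse, cast and unnest it.
-- Unnesting json_data itself would need two aliases, e.g. t(k, v), because it is a map.
CROSS JOIN UNNEST(
    CAST(json_parse(json_data['characters']) AS array(json))
) AS t(entry);
```

Against the real table, the WITH clause would be dropped and the rest kept as is, provided the column type matches the assumption.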

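In the example above, I convert the JSON value into an array(json), but you can convert it further to something more specific if the values inside each array entry have a regular schema. For example, your data can be cast to array(map(varchar, json)), since every element in the array is a JSON object.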
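A minimal sketch of that more specific cast, using a shortened version of the sample data. The output column names and the target types (bigint, integer, varchar) are illustrative choices, not something the original data prescribes.

```sql
WITH data(characters) AS (
    VALUES '[{"ITEM":10,"ITEM_POWER":60,"ITEM_CATEGORY":"Character"},{"ITEM":39,"ITEM_POWER":50,"ITEM_CATEGORY":"Character"}]'
)
SELECT
    -- Each entry is a map(varchar, json), so fields are looked up by key
    -- and the JSON scalar is then cast to a SQL type.
    CAST(entry['ITEM'] AS bigint)           AS item,
    CAST(entry['ITEM_POWER'] AS integer)    AS item_power,
    CAST(entry['ITEM_CATEGORY'] AS varchar) AS item_category
FROM data
CROSS JOIN UNNEST(
    CAST(json_parse(characters) AS array(map(varchar, json)))
) AS t(entry);
```

Keeping the map values as json and casting per field avoids forcing every attribute into a single SQL type, which matters here because the objects mix numbers and strings.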