Adding two maps in Presto

Nov*_*ice 3 presto

My data looks like this:

customer_id   usage_month  usage_by_product         usage
1             June         {"A":50, "B":50}         100
1             July         {"A":50, "B":10, "C":20} 80
1             Aug          {"A":50, "D":500}        550
1             Sep          {"C" :30}                30

I want to write a query that aggregates the total usage over the whole year:

customer_id   usage_by_product                       usage
1             {"A":150, "B":60, "C":50, "D":500}     760

Is this kind of aggregation over maps possible in Athena (Presto)?

Pio*_*sen 5

You can use map_entries + UNNEST to split the maps into individual key/value pairs. Then sum the values per key and aggregate them back into a map.

For example:

WITH input AS (
    -- sample rows: one map per row for the same customer
    SELECT * FROM (VALUES
        (1, map(array['a', 'c'], array[50, 42])),
        (1, map(array['a', 'b'], array[50, 18]))
    ) t(customer_id, m)
),
sum_by_map_key AS (
    -- one row per (customer, key) with the values summed
    SELECT customer_id, k, sum(v) AS s
    FROM input
    CROSS JOIN UNNEST(map_entries(m)) AS u(k, v)
    GROUP BY customer_id, k
)
-- rebuild a single map per customer from the per-key sums
SELECT customer_id, map_agg(k, s)
FROM sum_by_map_key
GROUP BY customer_id;

Output:

 customer_id |        _col1
-------------+---------------------
           1 | {a=100, b=18, c=42}
(1 row)
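As a rough sketch, the same pattern can be applied to the data from the question. The CTE name customer_usage and the aliases product, amount, product_usage and total_usage are made up for illustration; in practice the CTE would be replaced by the real table. The total is computed as the sum of the per-product sums, which assumes that each row's usage equals the sum of the values in its usage_by_product map (true for the sample rows).

```sql
WITH customer_usage AS (
    -- sample rows copied from the question; stand-in for the real table
    SELECT * FROM (VALUES
        (1, 'June', map(array['A', 'B'], array[50, 50]), 100),
        (1, 'July', map(array['A', 'B', 'C'], array[50, 10, 20]), 80),
        (1, 'Aug',  map(array['A', 'D'], array[50, 500]), 550),
        (1, 'Sep',  map(array['C'], array[30]), 30)
    ) t(customer_id, usage_month, usage_by_product, usage)
),
sum_by_product AS (
    -- one row per (customer, product) with usage summed over all months
    SELECT customer_id, product, sum(amount) AS product_usage
    FROM customer_usage
    CROSS JOIN UNNEST(map_entries(usage_by_product)) AS u(product, amount)
    GROUP BY customer_id, product
)
SELECT customer_id,
       map_agg(product, product_usage) AS usage_by_product,
       -- assumes usage is always the sum of the map values
       sum(product_usage) AS total_usage
FROM sum_by_product
GROUP BY customer_id;
```

For the four sample rows this should give A=150, B=60, C=50, D=500 and a total of 760, matching the result asked for in the question.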

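Note: to add two maps like this, you can use map_zip_with. However, to use it when aggregating multiple rows, you would probably need to aggregate all the map values into a single array(map) and run an array reduce over it. Depending on the sizes of these arrays, aggregating all the maps into a single array(map) may or may not be practical.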
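A minimal sketch of both variants from the note, using the same sample maps as above. The first query adds two maps with map_zip_with; a key missing from one side arrives in the lambda as NULL, hence the coalesce. The second query is one possible way to do the multi-row version with array_agg and reduce. The casts to map(varchar, bigint) and the explicitly typed empty map used as the initial state are assumptions made to keep the types aligned, and the query has not been verified on Athena.

```sql
-- Adding exactly two maps
SELECT map_zip_with(
    map(array['a', 'c'], array[50, 42]),
    map(array['a', 'b'], array[50, 18]),
    (k, v1, v2) -> coalesce(v1, 0) + coalesce(v2, 0)
) AS summed;

-- Aggregating many rows: collect the maps into an array, then fold over it
WITH input AS (
    SELECT customer_id, CAST(m AS map(varchar, bigint)) AS m
    FROM (VALUES
        (1, map(array['a', 'c'], array[50, 42])),
        (1, map(array['a', 'b'], array[50, 18]))
    ) t(customer_id, m)
)
SELECT customer_id,
       reduce(
           array_agg(m),
           -- initial state: an empty map with the same key/value types
           map(CAST(array[] AS array(varchar)), CAST(array[] AS array(bigint))),
           -- merge the next map into the accumulator, summing per key
           (acc, x) -> map_zip_with(acc, x,
                           (k, v1, v2) -> coalesce(v1, 0) + coalesce(v2, 0)),
           acc -> acc
       ) AS summed
FROM input
GROUP BY customer_id;
```

For the two sample maps the first query should return a=100, b=18, c=42, the same result as the map_entries + UNNEST approach. The second variant holds every map of a group in a single array before folding, which is the cost the note refers to.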