My data looks like this:
customer_id usage_month usage_by_product usage
1 June {"A":50, "B":50} 100
1 July {"A":50, "B":10, "C":20} 80
1 Aug {"A":50, "D":500} 550
1 Sep {"C":30} 30
I want to write a query that aggregates the total usage over the whole year:
customer_id usage_by_product usage
1 {"A":150, "B":60, "C":50, "D":500} 760
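Is it possible to do this kind of aggregation over maps in Athena (Presto)?
You can use map_entries + UNNEST to split each map into individual key/value pairs. Then sum the values per key and aggregate the results back into a map with map_agg.
For example: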
WITH input AS (
SELECT * FROM (VALUES
(1, map(array['a', 'c'], array[50, 42])),
(1, map(array['a', 'b'], array[50, 18]))
) t(customer_id, m)
),
sum_by_map_key AS (
SELECT customer_id, k, sum(v) AS s
FROM input
CROSS JOIN UNNEST(map_entries(m)) AS u(k, v)
GROUP BY customer_id, k
)
SELECT customer_id, map_agg(k, s)
FROM sum_by_map_key
GROUP BY customer_id;
Output:
customer_id | _col1
-------------+---------------------
1 | {a=100, b=18, c=42}
(1 row)
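Applied to the data from the question, the same pattern might look like the sketch below. It assumes usage_by_product is already a map(varchar, integer) column; if it is stored as a JSON string, it would first need to be converted, for example with CAST(json_parse(usage_by_product) AS map(varchar, integer)). The CTE names (product_totals, usage_totals) are made up for illustration, and the map is passed to UNNEST directly, which Presto expands into a key column and a value column.

```sql
WITH input AS (
    -- Inline copy of the sample rows from the question
    SELECT * FROM (VALUES
        (1, 'June', map(array['A', 'B'], array[50, 50]), 100),
        (1, 'July', map(array['A', 'B', 'C'], array[50, 10, 20]), 80),
        (1, 'Aug', map(array['A', 'D'], array[50, 500]), 550),
        (1, 'Sep', map(array['C'], array[30]), 30)
    ) t(customer_id, usage_month, usage_by_product, usage)
),
product_totals AS (
    -- Sum per (customer, product), then fold the sums back into one map per customer
    SELECT customer_id, map_agg(product, total) AS usage_by_product
    FROM (
        SELECT customer_id, product, sum(amount) AS total
        FROM input
        CROSS JOIN UNNEST(usage_by_product) AS u(product, amount)
        GROUP BY customer_id, product
    ) per_product
    GROUP BY customer_id
),
usage_totals AS (
    -- Sum the scalar usage column separately, so the UNNEST does not multiply its rows
    SELECT customer_id, sum(usage) AS usage
    FROM input
    GROUP BY customer_id
)
SELECT p.customer_id, p.usage_by_product, u.usage
FROM product_totals p
JOIN usage_totals u ON u.customer_id = p.customer_id;
```

The usage column is summed in its own CTE because after the UNNEST each source row appears once per map key, so summing usage there would count it several times. For customer 1 this should give the totals asked for in the question: A = 150, B = 60, C = 50, D = 500 and usage = 760.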
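Note: to add two maps like this, you can use map_zip_with. However, to use it when aggregating multiple rows, you would probably need to collect all the map values into a single array and run an array reduction over it. Depending on the size of those arrays, aggregating all the maps into a single array(map) may or may not be efficient.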
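A rough sketch of that alternative, using the same sample input as the example above: array_agg collects the maps of each customer into one array, and reduce folds them together with map_zip_with. The explicit casts to map(varchar, bigint) are an assumption made here so that the empty initial map and the input maps have exactly the same type; with a real table the column type would be used instead.

```sql
WITH input AS (
    -- Cast so every map has the same concrete type as the initial reduce state
    SELECT customer_id, CAST(m AS map(varchar, bigint)) AS m
    FROM (VALUES
        (1, map(array['a', 'c'], array[50, 42])),
        (1, map(array['a', 'b'], array[50, 18]))
    ) t(customer_id, m)
)
SELECT
    customer_id,
    reduce(
        array_agg(m),                          -- all maps of one customer
        CAST(map() AS map(varchar, bigint)),   -- start from an empty map
        -- map_zip_with passes NULL for a key missing on one side, hence coalesce
        (acc, x) -> map_zip_with(acc, x,
            (k, v1, v2) -> coalesce(v1, 0) + coalesce(v2, 0)),
        acc -> acc
    ) AS usage_by_product
FROM input
GROUP BY customer_id;
```

This keeps everything in a single GROUP BY, at the cost of materializing every map of a group in one array before reducing it.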