Har*_*ryD 3 sql sum count window-functions snowflake-cloud-data-platform
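Without resorting to a CTE or subquery, is there any way to use window functions at a different summary level than the GROUP BY? COUNT(*) works, but if a column name is specified in COUNT, or the SUM function is used, the query fails with an "is not a valid group by expression" error. The error occurs even when the PARTITION BY columns are the same as the GROUP BY columns.
The commented-out lines below cause the query to fail. Yet these are exactly the kinds of calculations one would want window functions for in the first place.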
create table sales (product_id integer, retail_price real, quantity integer, city varchar, state varchar);
insert into sales (product_id, retail_price, quantity, city, state) values
(1, 2.00, 1, 'SF', 'CA'),
(1, 2.00, 2, 'SJ', 'CA'),
(2, 5.00, 4, 'SF', 'CA'),
(2, 5.00, 8, 'SJ', 'CA'),
(2, 5.00, 16, 'Miami', 'FL'),
(2, 5.00, 32, 'Orlando', 'FL'),
(2, 5.00, 64, 'SJ', 'PR');
select city, state
, count(*) as city_sale_cnt
, count(*) over ( partition by state) as state_sale_cnt
-- , count(product_id) over ( partition by state) as state_sale_cnt2
, sum(retail_price) as city_price
-- , sum(retail_price) over ( partition by state) as state_price
from sales
group by 1,2;
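The documentation indicates that window functions can cause problems, including the vague warning "PARTITION BY is not always compatible with GROUP BY": "The error message SQL compilation error: ... is not a valid group by expression is often a sign that different columns in the SELECT statement's 'project' clause are not partitioned the same way and therefore might produce different numbers of rows."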
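One way to read that warning, sketched against the `sales` table above: without a GROUP BY, a window function sees the 7 raw rows, so `product_id` and `retail_price` are still available to it. The query below is only an illustration and returns one row per sale rather than one row per city, which is why it cannot simply be combined with the grouped columns.

```sql
-- Illustration only: no GROUP BY, so the window functions run over the raw rows.
-- This returns 7 rows (one per sale), not 5 rows (one per city/state pair).
select city, state,
       count(product_id) over (partition by state) as state_sale_cnt2,
       sum(retail_price) over (partition by state) as state_price
from sales;
-- On the sample data, every CA row shows state_sale_cnt2 = 4 and state_price = 14,
-- every FL row shows 2 and 10, and the single PR row shows 1 and 5.
```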
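The commented-out code is not correct. The reason is that window functions are evaluated after the GROUP BY, and neither product_id nor retail_price exists as a column after the GROUP BY.
This is easily fixed: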
select city, state,
count(*) as city_sale_cnt,
count(*) over (partition by state) as state_sale_cnt,
sum(count(product_id)) over (partition by state) as state_sale_cnt2,
sum(retail_price) as city_price,
sum(sum(retail_price)) over ( partition by state) as state_price
from sales
group by 1, 2;
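At first, using window functions in an aggregation query looks a bit confusing, and the nested aggregate functions look awkward. I find, though, that the syntax is easy to get used to once you have used it a few times.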
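A minimal sketch of how the nested form can be read inside out, using the `sales` table from the question. The alias `cities_in_state` is a hypothetical name introduced here to make the contrast visible; it is not in the original query.

```sql
-- Step 1: GROUP BY runs first and yields one row per (city, state).
-- Step 2: window functions run over those grouped rows, so their arguments
--         must be grouping columns or aggregates such as count(*).
select city, state,
       count(*)                                as city_sale_cnt,    -- sales per city
       count(*) over (partition by state)      as cities_in_state,  -- grouped rows per state
       sum(count(*)) over (partition by state) as state_sale_cnt    -- sales per state
from sales
group by city, state;
-- On the sample data, CA has 2 grouped rows (SF and SJ) holding 4 sales in total,
-- so cities_in_state = 2 while state_sale_cnt = 4.
```

Note that a bare `count(*) over (...)` in a grouped query counts groups, not underlying rows, which is why the inner `count(*)` has to be wrapped in `sum(...)` to get a state-level sale count.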
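Although Snowflake may allow this (as Gordon Linoff demonstrates), I would advocate wrapping the aggregate query and using the window functions in an outer query.
Very few RDBMSs allow mixing window functions and aggregation, and the resulting queries are generally hard to follow (unless you are a real SQL wizard like Gordon!).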
select
t.*,
sum(city_sale_cnt) over (partition by state) as state_sale_cnt,
sum(city_price) over ( partition by state) as state_price
from (
select
city,
state,
count(*) as city_sale_cnt,
sum(retail_price) as city_price
from sales
group by 1,2
) t;
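As a rough illustration of why the wrapped form can be easier to extend, the outer query can reference the city-level aliases directly. The sketch below adds a per-city share of the state total; the column name `city_price_share` is hypothetical and not part of the original answer.

```sql
-- Sketch: reuse the inner aliases to derive each city's share of its state total.
select
    t.*,
    sum(city_price) over (partition by state) as state_price,
    -- nullif guards against a zero state total (assumption: prices could be 0)
    city_price / nullif(sum(city_price) over (partition by state), 0) as city_price_share
from (
    select
        city,
        state,
        count(*)          as city_sale_cnt,
        sum(retail_price) as city_price
    from sales
    group by city, state
) t;
-- On the sample data, SF and SJ in CA each have city_price = 7 of a state total of 14.
```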