Workaround for multiple rollups

Dav*_*542 6 sql rollup combinatorics google-bigquery

Is there a way to accomplish the following in BigQuery? Databases such as Postgres support this syntax:

SELECT ProductGroup, Product, Year, Month, AVG(Revenue) 
FROM Sales
group by rollup(ProductGroup, Product), rollup(Year, Month)

In other words, I want the cross product of two rollups:

ROLLUP(ProductGroup, Product) --> (), (ProductGroup), (ProductGroup, Product)
ROLLUP(Year, Month) --> (), (Year), (Year, Month)

((), (ProductGroup), (ProductGroup, Product)) x ((), (Year), (Year, Month))
= (
    (), (ProductGroup), (ProductGroup, Product),
    (Year), (Year, ProductGroup), (Year, ProductGroup, Product),
    (Year, Month), (Year, Month, ProductGroup), (Year, Month, ProductGroup, Product)
)
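
The expansion above can be checked mechanically. The following is a minimal Python sketch, not part of the original question; the helper names `rollup` and `cross_rollups` are made up for illustration. It treats each ROLLUP as the list of prefixes of its columns and concatenates one prefix from each rollup.

```python
from itertools import product


def rollup(*cols):
    """Grouping sets of ROLLUP(cols): every prefix, from () up to all columns."""
    return [tuple(cols[:i]) for i in range(len(cols) + 1)]


def cross_rollups(*rollups):
    """Pick one grouping set from each rollup and concatenate them."""
    return [sum(combo, ()) for combo in product(*rollups)]


# Year/Month is listed first so the output order matches the listing above.
grouping_sets = cross_rollups(
    rollup("Year", "Month"),
    rollup("ProductGroup", "Product"),
)

for gs in grouping_sets:
    print(gs)
```

Two rollups of two columns each have three prefixes apiece, so the cross product contains 3 x 3 = 9 grouping sets.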

When I try this in BQ, I get the following error:

The GROUP BY clause only supports ROLLUP when there are no other grouping elements at [2:10]


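Here is an update with some sample images and data.

First, I want to replicate the functionality of an Excel pivot table. This is where the cross product of the ROWS and COLS rollups comes in:

[image: Excel pivot table]

Note that the pivot table has 63 value cells.

Now, the correct SQL is as follows (using only the verbose GROUP BY syntax):

[image: query result]

Note that this also produces exactly 63 rows (and since we have a single value column, SUM Revenue, 63 rows x 1 column = 63 value cells). Here is the query: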

with sales as (
    select 2010 Year, 'Jan' Month, 'Electronics' ProductGroup, 'Phone' Product, 100 Revenue union all
    select 2010,    'Jan',  'Electronics',  'Laptop',   200 union all
    select 2010,    'Jan',  'Cars', 'Jeep', 250 union all
    select 2010,    'Jan',  'Cars', 'Hummer',   105 union all
    select 2010,    'Feb',  'Electronics',  'Phone',    110 union all
    select 2010,    'Feb',  'Electronics',  'Laptop',   300 union all
    select 2010,    'Feb',  'Cars', 'Jeep', 50 union all
    select 2010,    'Feb',  'Cars', 'Hummer',   75 union all
    select 2010,    'Mar',  'Electronics',  'Phone',    80 union all
    select 2010,    'Mar',  'Electronics',  'Laptop',   200 union all
    select 2010,    'Mar',  'Cars', 'Jeep', 100 union all
    select 2010,    'Mar',  'Cars', 'Hummer',   50 union all
    select 2011,    'Jan',  'Electronics',  'Phone',    200 union all
    select 2011,    'Jan',  'Electronics',  'Laptop',   300 union all
    select 2011,    'Jan',  'Cars', 'Jeep', 100 union all
    select 2011,    'Jan',  'Cars', 'Hummer',   200 union all
    select 2011,    'Feb',  'Electronics',  'Phone',    300 union all
    select 2011,    'Feb',  'Electronics',  'Laptop',   900 union all
    select 2011,    'Feb',  'Cars', 'Jeep', 100 union all
    select 2011,    'Feb',  'Cars', 'Hummer',   200 union all
    select 2011,    'Mar',  'Electronics',  'Phone',    400 union all
    select 2011,    'Mar',  'Electronics',  'Laptop',   350 union all
    select 2011,    'Mar',  'Cars', 'Jeep', 240 union all
    select 2011,    'Mar',  'Cars', 'Hummer',   130
)
-- ROLLUP(ProductGroup, Product), ROLLUP(Year, Month)
--> (), (ProductGroup), (ProductGroup, Product)
--> (Year), (Year, ProductGroup), (Year, ProductGroup, Product)
--> (Year, Month), (Year, Month, ProductGroup), (Year, Month, ProductGroup, Product)

SELECT NULL, NULL, NULL, NULL, AVG(Revenue) FROM Sales UNION ALL                                                -- ()
SELECT ProductGroup, NULL, NULL, NULL, AVG(Revenue) FROM Sales GROUP BY ProductGroup UNION ALL                  -- (ProductGroup)
SELECT ProductGroup, Product, NULL, NULL, AVG(Revenue) FROM Sales GROUP BY ProductGroup, Product UNION ALL      -- (ProductGroup, Product)

SELECT NULL, NULL, Year, NULL, AVG(Revenue) FROM Sales GROUP BY Year UNION ALL                                  -- (Year)
SELECT ProductGroup, NULL, Year, NULL, AVG(Revenue) FROM Sales GROUP BY Year, ProductGroup UNION ALL            -- (Year, ProductGroup)
SELECT ProductGroup, Product, Year, NULL, AVG(Revenue) FROM Sales GROUP BY Year, ProductGroup, Product UNION ALL-- (Year, ProductGroup, Product)

SELECT NULL, NULL, Year, Month, AVG(Revenue) FROM Sales GROUP BY Year, Month UNION ALL                          -- (Year, Month)
SELECT ProductGroup, NULL, Year, Month, AVG(Revenue) FROM Sales GROUP BY ProductGroup, Year, Month UNION ALL    -- (ProductGroup, Year, Month)
SELECT ProductGroup, Product, Year, Month, AVG(Revenue) FROM Sales GROUP BY ProductGroup, Product, Year, Month  -- (ProductGroup, Product, Year, Month)

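However, this query is a real nightmare in production, even when generated programmatically, because there may be an ORDER BY, subselects, and so on, and UNIONing all of these statements together can turn into a monstrous construct. A rollup over n columns produces n + 1 grouping sets, so a 3 rows x 3 cols pivot built on a 100-line SQL statement would become 4^2 * 100 lines of SQL, a 5x5 would be 6^2 * 100 lines, and so on (if my math is correct).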
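
To make the branch count concrete, here is a rough Python sketch (not from the original post) that generates the UNION ALL branches for two rollups. The function name `union_all_branches` and its parameters are hypothetical, and the generated SQL omits column aliases and ordering, so treat it as an illustration of the growth rather than production-ready output.

```python
from itertools import product


def rollup(cols):
    """Grouping sets of ROLLUP(cols): every prefix of the column list."""
    return [cols[:i] for i in range(len(cols) + 1)]


def union_all_branches(row_cols, col_cols, measure="AVG(Revenue)", table="Sales"):
    """Build one SELECT per grouping set in ROLLUP(row_cols) x ROLLUP(col_cols)."""
    select_cols = row_cols + col_cols
    branches = []
    for rows, cols in product(rollup(row_cols), rollup(col_cols)):
        keys = rows + cols
        # Columns outside the current grouping set are emitted as NULL.
        projected = [c if c in keys else "NULL" for c in select_cols]
        sql = f"SELECT {', '.join(projected)}, {measure} FROM {table}"
        if keys:
            sql += " GROUP BY " + ", ".join(keys)
        branches.append(sql)
    return branches


branches = union_all_branches(["ProductGroup", "Product"], ["Year", "Month"])
print("\nUNION ALL\n".join(branches))
print(len(branches), "branches")
```

With n row columns and m column columns the generator emits (n + 1) * (m + 1) branches: 9 for the 2 x 2 case in this question, 16 for 3 x 3, and 36 for 5 x 5.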

So what is the right way to do this? Note that in a database like Postgres, the following works as-is:

SELECT ProductGroup, Product, Year, Month, AVG(Revenue) FROM Sales GROUP BY ROLLUP(ProductGroup, Product), ROLLUP(Year, Month);

If you'd like to use it as a starting point, here is the saved query: https://console.cloud.google.com/bigquery?sq=260144861653:552549d2a81a47b59df6e3d16ef9bf17


Finally, if you think adding GROUPING SETS and CUBE would be a useful feature, please upvote this feature request: https://issuetracker.google.com/issues/204913323

Mik*_*ant 0

I want the cross product of two rollups:

Consider below

select * from (
select date, code 
from `first-outlet-750.tests.parq_stored`
group by rollup(date, code)
), (
select country, state 
from `first-outlet-750.tests.parq_stored`
group by rollup(country, state))           

The output is as follows

[image: query output]
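
The query above returns the combinations of grouping keys but no aggregates. One possible way to get the aggregates in a single statement, sketched here as an assumption rather than as part of the original answer, is to cross join the data with a small table of "rollup levels" (0, 1, 2) per dimension, null out the columns beyond the current level, and group once. The sketch below runs the idea against SQLite through Python's `sqlite3` module with the sample data from the question, only to show that the logic produces the expected 63 rows; the `levels` CTE and the aliases `pl`, `tl`, `p_level`, `t_level` are invented for the example. In BigQuery the levels could presumably come from `UNNEST([0, 1, 2])`, which is untested here.

```python
import sqlite3

# Sample data from the question: revenue per (Year, Month) for four products.
products = [("Electronics", "Phone"), ("Electronics", "Laptop"),
            ("Cars", "Jeep"), ("Cars", "Hummer")]
revenue = {
    2010: {"Jan": (100, 200, 250, 105), "Feb": (110, 300, 50, 75), "Mar": (80, 200, 100, 50)},
    2011: {"Jan": (200, 300, 100, 200), "Feb": (300, 900, 100, 200), "Mar": (400, 350, 240, 130)},
}
rows = [(year, month, group, product, value)
        for year, months in revenue.items()
        for month, values in months.items()
        for (group, product), value in zip(products, values)]

QUERY = """
WITH levels(n) AS (VALUES (0), (1), (2))
SELECT pg, p, y, m, AVG(Revenue) AS avg_revenue
FROM (
    SELECT
        -- Level 0 keeps no column, level 1 the first, level 2 both.
        CASE WHEN pl.n >= 1 THEN ProductGroup END AS pg,
        CASE WHEN pl.n >= 2 THEN Product END AS p,
        CASE WHEN tl.n >= 1 THEN Year END AS y,
        CASE WHEN tl.n >= 2 THEN Month END AS m,
        pl.n AS p_level,
        tl.n AS t_level,
        Revenue
    FROM Sales
    CROSS JOIN levels AS pl
    CROSS JOIN levels AS tl
)
-- Grouping by the levels keeps subtotal rows apart from real NULLs in the data.
GROUP BY p_level, t_level, pg, p, y, m
"""

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Sales (Year INT, Month TEXT, ProductGroup TEXT, "
            "Product TEXT, Revenue INT)")
con.executemany("INSERT INTO Sales VALUES (?, ?, ?, ?, ?)", rows)
result = con.execute(QUERY).fetchall()
print(len(result))  # 63 rows for the sample data
```

The trade-off is that every input row is processed once per level combination (9 times here), in exchange for keeping the base query in one place instead of repeating it in nine UNION ALL branches.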