Create a statistics summary of selected columns in BigQuery

Question

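I'm trying to summarize some statistical properties, such as continuous quantiles, mean, and standard deviation, from several numeric columns, then wrap them into rows with the original column name attached as an additional column. I know how to get them from a single column using AVG, STDDEV_POP, PERCENTILE_CONT, ..., but I haven't found any articles or recipes on computing them over multiple columns at once. Any ideas?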

Sample input:

id	col1	col2	col3
1	1.0	2.0	4.0
2	2.0	4.0	8.0
3	3.0	6.0	12.0
4	4.0	8.0	16.0

Expected output:

col_name	Q1	Q2	Q3	mean	std
col1	1.75	2.5	3.25	2.5	1.12
col2	3.5	5.0	6.5	5.0	2.24
col3	7.0	10.0	13.0	10.0	4.47

Or the "transposed" version:

stats	col1	col2	col3
Q1	1.75	3.5	7.0
Q2	2.5	5.0	10.0
Q3	3.25	6.5	13.0
mean	2.5	5.0	10.0
std	1.12	2.24	4.47

Answer 1

Mik*_*ant 6

Consider the approach below:

select distinct col, 
  percentile_cont(value, 0.25) over win as q1,
  percentile_cont(value, 0.50) over win as q2,
  percentile_cont(value, 0.75) over win as q3,
  avg(value) over win as avg, 
  stddev_pop(value) over win as std, 
from your_table
unpivot (value for col in (col1, col2, col3))
window win as (partition by col)

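If applied to the sample data in your question, the output has one row per unpivoted column (col1, col2, col3) with the fields q1, q2, q3, avg, and std.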
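
To try the query without an existing table, the sample rows from the question can be inlined as a CTE. This is only a minimal sketch: the CTE reuses the placeholder name your_table, and the round(..., 2) around std plus the final order by are additions made here to mirror the two-decimal display in the question, not part of the query above.

```sql
-- Sample rows from the question, inlined so no existing table is needed.
-- The name your_table is a placeholder.
with your_table as (
  select 1 as id, 1.0 as col1, 2.0 as col2, 4.0 as col3 union all
  select 2, 2.0, 4.0, 8.0 union all
  select 3, 3.0, 6.0, 12.0 union all
  select 4, 4.0, 8.0, 16.0
)
select distinct col,
  -- percentile_cont interpolates linearly between neighbouring values
  percentile_cont(value, 0.25) over win as q1,
  percentile_cont(value, 0.50) over win as q2,
  percentile_cont(value, 0.75) over win as q3,
  avg(value) over win as avg,
  -- rounding is an assumption, added only to match the question's display
  round(stddev_pop(value) over win, 2) as std
from your_table
-- unpivot turns col1..col3 into (col, value) pairs: one row per id and column
unpivot (value for col in (col1, col2, col3))
-- every statistic is computed over all values that share a column name
window win as (partition by col)
order by col
```

With the four sample rows this should give the values from the question's expected output, for example 1.75, 2.5, 3.25, 2.5 and 1.12 for col1. The select distinct is what collapses the repeated per-row window results into a single row per column.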

To get the "transposed" version, use the query below:

select * from (
  select * from (
    select distinct col, 
      percentile_cont(value, 0.25) over win as q1,
      percentile_cont(value, 0.50) over win as q2,
      percentile_cont(value, 0.75) over win as q3,
      avg(value) over win as avg, 
      stddev_pop(value) over win as std, 
    from your_table
    unpivot (value for col in (col1, col2, col3))
    window win as (partition by col)
  ) unpivot (value for stats in (q1, q2, q3, avg, std))
) pivot (any_value(value) for col in ('col1', 'col2', 'col3'))

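In this case, the output has one row per statistic (q1, q2, q3, avg, std) and one column each for col1, col2, and col3.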
