Reusing a BigQuery window function partition

Cha*_*her 8 google-bigquery

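I need to select multiple columns as part of a LEAD statement. This seems very inefficient, since it triples the number of sorts and partitions required: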

SELECT 
    field,
    field2,
    field3,
    LEAD(field, 1) OVER (PARTITION BY field ORDER BY field ASC) AS nextField,
    LEAD(field2, 1) OVER (PARTITION BY field ORDER BY field ASC) AS nextField2,
    LEAD(field3, 1) OVER (PARTITION BY field ORDER BY field ASC) AS nextField3,
FROM dataset.table
  • Is there a better way to do this?
  • Does BigQuery optimize for this when the query runs, so that it is efficient?

Mos*_*sky 9

A few points to add to Mikhail's answer:

  1. Yes, BigQuery optimizes this: if the window frames are identical, the frame is set up only once and the multiple functions run over it.

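  2. You are right that writing the same frame over and over is tedious, so we are working on improving the BigQuery SQL dialect to make it more standards-compliant, and in the near future* you will be able to write: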


SELECT 
    field,
    field2,
    field3,
    LEAD(field, 1) OVER w1 AS nextField,
    LEAD(field2, 1) OVER w1 AS nextField2,
    LEAD(field3, 1) OVER w1 AS nextField3,
FROM dataset.table
WINDOW w1 AS (PARTITION BY field ORDER BY field ASC)

* I can't give you a firm date, but this is in internal testing now, so it shouldn't be long.
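
For anyone who wants to see that the named-window form returns the same rows as the repeated OVER clauses, here is a minimal sketch that can be run locally. It uses Python's built-in sqlite3 module as a stand-in, not BigQuery, so it says nothing about BigQuery's optimizer. The table `t` and its sample rows are made up for illustration, and the window is ordered by `field2` rather than `field` (an assumption of this sketch, not taken from the question) so that the row order inside each partition is deterministic. It needs an SQLite build with window function support (3.25 or later).

```python
import sqlite3

# In-memory SQLite database as a local stand-in; this is NOT BigQuery.
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, invented for this sketch.
conn.execute("CREATE TABLE t (field TEXT, field2 INTEGER, field3 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("a", 1, "x"), ("a", 2, "y"), ("a", 3, "z"), ("b", 10, "p"), ("b", 20, "q")],
)

# Form 1: the window specification is repeated in every OVER clause.
repeated = """
SELECT field, field2, field3,
    LEAD(field2, 1) OVER (PARTITION BY field ORDER BY field2 ASC) AS nextField2,
    LEAD(field3, 1) OVER (PARTITION BY field ORDER BY field2 ASC) AS nextField3
FROM t
ORDER BY field, field2
"""

# Form 2: the window is defined once in a WINDOW clause and referenced by name.
named = """
SELECT field, field2, field3,
    LEAD(field2, 1) OVER w1 AS nextField2,
    LEAD(field3, 1) OVER w1 AS nextField3
FROM t
WINDOW w1 AS (PARTITION BY field ORDER BY field2 ASC)
ORDER BY field, field2
"""

rows_repeated = conn.execute(repeated).fetchall()
rows_named = conn.execute(named).fetchall()

# Both forms should produce identical result sets.
assert rows_repeated == rows_named

for row in rows_named:
    print(row)
```

The last row of each partition has no following row, so its LEAD columns come back as NULL (None in Python).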