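I know BigQuery has an AVG function, and that there are window functions for shifting to the previous or next value, but is there any function that lets you average over a specified interval? For example, I would like something like the following: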
SELECT
city
AVG(temperature) OVER(PARTITION BY city, INTERVAL day,14, ORDER BY day) as rolling_avg_14_days,
AVG(temperature) OVER(PARTITION BY city, INTERVAL day,30, ORDER BY day) as rolling_avg_30_days,
WHERE
city IN ("Los Angeles","Chicago","Sun Prairie","Sunnyvale")
AND year BETWEEN 1900 AND 2013
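I want to compute a rolling average where I can specify the range of values to aggregate over and the value to order by. The average would take the current temperature plus the previous 13 days (or the previous 29 days) and average them. Is this possible today? I know I could do something like this by putting 13 LAG/OVER fields in the SELECT statement and averaging all of their results, but that is a lot of overhead.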
Mik*_*ant 12
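I think the OVER with RANGE construct of window functions is the best fit here.
Assuming the day field is stored in 'YYYY-MM-DD' format, the query below computes the rolling averages: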
SELECT
city,
day,
AVG(temperature) OVER(PARTITION BY city ORDER BY ts
RANGE BETWEEN 14*24*3600 PRECEDING AND CURRENT ROW) AS rolling_avg_14_days,
AVG(temperature) OVER(PARTITION BY city ORDER BY ts
RANGE BETWEEN 30*24*3600 PRECEDING AND CURRENT ROW) AS rolling_avg_30_days
FROM (
SELECT day, city, temperature, TIMESTAMP_TO_SEC(TIMESTAMP(day)) AS ts
FROM temperatures
)
You most likely found this solution long ago, but I still wanted to post what I think is a better answer (as of today).
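To see how a RANGE frame behaves when some days are missing, here is a minimal sketch that runs the same pattern locally. It uses Python's built-in sqlite3 module instead of BigQuery, purely so the example is self-contained (this is an assumption, and it needs SQLite 3.28 or later). The sample rows and the 3-day window are made up for illustration. Note that `N PRECEDING AND CURRENT ROW` covers the current day plus N earlier days, so "today plus the previous 13 days" would correspond to 13 days' worth of seconds rather than 14.

```python
import sqlite3

# Hypothetical sample data: (city, day, temperature).
# 2013-01-04 is deliberately missing to show the difference from a ROWS frame.
rows = [
    ("Chicago", "2013-01-01", 10.0),
    ("Chicago", "2013-01-02", 20.0),
    ("Chicago", "2013-01-03", 30.0),
    ("Chicago", "2013-01-05", 50.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temperatures (city TEXT, day TEXT, temperature REAL)")
conn.executemany("INSERT INTO temperatures VALUES (?, ?, ?)", rows)

# Current day plus the 2 previous days, expressed in seconds.
window_seconds = 2 * 24 * 3600

query = f"""
SELECT city, day,
       AVG(temperature) OVER (
           PARTITION BY city
           ORDER BY ts
           RANGE BETWEEN {window_seconds} PRECEDING AND CURRENT ROW
       ) AS rolling_avg_3_days
FROM (
    -- strftime('%s', ...) plays the role of TIMESTAMP_TO_SEC(TIMESTAMP(day)) here
    SELECT city, day, temperature,
           CAST(strftime('%s', day) AS INTEGER) AS ts
    FROM temperatures
)
ORDER BY city, day
"""

rolling = conn.execute(query).fetchall()
for city, day, avg in rolling:
    print(city, day, avg)
```

For 2013-01-05 the frame reaches back only to 2013-01-03, so the average covers two rows (30 and 50) rather than the last three rows. That is the behavior a fixed number of LAG columns or a ROWS frame would not give you when data has gaps.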
A different option, using JOIN EACH (this may become too slow, because a huge amount of data is generated in the intermediate steps):
SELECT a.SensorId SensorId, a.Timestamp, AVG(b.Data) AS avg_prev_hour_load
FROM (
SELECT * FROM [io_sensor_data.moscone_io13]
WHERE SensorId = 'XBee_40670EB0/mic') a
JOIN EACH [io_sensor_data.moscone_io13] b
ON a.SensorId = b.SensorId
WHERE b.Timestamp BETWEEN (a.Timestamp - 36000000) AND a.Timestamp
GROUP BY SensorId, a.Timestamp;
(Based on Joe Celko's SQL problems.)
It could be useful to have larger ranges implemented for window functions, but for now I will auto-generate the query.
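The self-join idea can be tried on a small scale in the same way. The sketch below is not the BigQuery query itself. It assumes a hypothetical `sensor_data` table with integer timestamps in seconds, uses a plain JOIN in SQLite (via Python's sqlite3) in place of JOIN EACH, and treats the one-hour window constant as an assumption about units.

```python
import sqlite3

# Hypothetical sample data: (sensor_id, ts in seconds, data).
rows = [
    ("mic", 0, 1.0),
    ("mic", 1800, 3.0),
    ("mic", 3600, 5.0),
    ("mic", 9000, 7.0),
    ("other", 3600, 100.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sensor_data (sensor_id TEXT, ts INTEGER, data REAL)")
conn.executemany("INSERT INTO sensor_data VALUES (?, ?, ?)", rows)

WINDOW_SECONDS = 3600  # assumed window: the previous hour, inclusive

# Each row of `a` is paired with every row of `b` for the same sensor whose
# timestamp falls in [a.ts - window, a.ts]; the pairs are then averaged.
query = """
SELECT a.sensor_id, a.ts, AVG(b.data) AS avg_prev_hour_load
FROM sensor_data a
JOIN sensor_data b
  ON a.sensor_id = b.sensor_id
 AND b.ts BETWEEN a.ts - ? AND a.ts
WHERE a.sensor_id = ?
GROUP BY a.sensor_id, a.ts
ORDER BY a.ts
"""

result = conn.execute(query, (WINDOW_SECONDS, "mic")).fetchall()
for sensor_id, ts, avg in result:
    print(sensor_id, ts, avg)
```

The intermediate join holds one row per (reading, reading-in-window) pair, which is why this approach can grow very large on dense data compared with a window frame.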