Tho*_*rst 3 r forecasting dplyr
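I want to use dplyr to fit and forecast several models. The models are fitted to time series data, and each hour gets its own model: hour = 1 is one model, hour = 18 is another.
Example: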
# Historical data - Basis for the models:
df.h <- data.frame(
hour = factor(rep(1:24, each = 100)),
price = runif(2400, min = -10, max = 125),
wind = runif(2400, min = 0, max = 2500),
temp = runif(2400, min = - 10, max = 25)
)
# Forecasted data for wind and temp:
df.f <- data.frame(
hour = factor(rep(1:24, each = 10)),
wind = runif(240, min = 0, max = 2500),
temp = runif(240, min = - 10, max = 25)
)
I can compute each model hour by hour:
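library(dplyr)
library(forecast)
# Fit the hour-1 model; current forecast versions expect xreg as a numeric matrix
df.h.1 <- filter(df.h, hour == 1)
fit <- Arima(df.h.1$price, xreg = as.matrix(df.h.1[, c("wind", "temp")]), order = c(1, 1, 0))
# forecast() dispatches to the Arima method (forecast.Arima is no longer exported)
df.f.1 <- filter(df.f, hour == 1)
forecast(fit, xreg = as.matrix(df.f.1[, c("wind", "temp")]))$mean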
But it would be great to be able to do something like this:
fits <- group_by(df.h, hour) %>%
do(fit = Arima(df.h$price, order= c(1, 1, 0), xreg = df.h[, 3:4]))
df.f %>% group_by(hour)%>% do(forecast.Arima(fits, xreg = .[, 2:3])$mean)
If you want to pack it into a single call, you can bind the historical and forecast data into one data.frame and then split it again inside the do call.
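# Stack history and forecast rows; forecast rows are marked by price = NA
df <- rbind(df.h, data.frame(df.f, price = NA))
res <- group_by(df, hour) %>% do({
  # Within each hour, split the rows back into history and forecast
  hist <- .[!is.na(.$price), ]
  fore <- .[is.na(.$price), c("hour", "wind", "temp")]
  fit <- Arima(hist$price, xreg = as.matrix(hist[, c("wind", "temp")]), order = c(1, 1, 0))
  # forecast() dispatches to the Arima method; xreg must be a numeric matrix
  data.frame(fore, price = as.numeric(forecast(fit, xreg = as.matrix(fore[, c("wind", "temp")]))$mean))
})
res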
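The two-step form sketched in the question (fit all models first, forecast afterwards) can also be written without do. The following is only a sketch under assumptions, not part of the original answer: it uses base R split, lapply and Map instead of dplyr, regenerates the example data so that it runs on its own, and introduces the names regs, fits, fore.by.hour and res2 purely for illustration. The set.seed call is likewise an addition for reproducibility.

```r
library(forecast)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# Same shape as df.h and df.f in the question
df.h <- data.frame(
  hour  = factor(rep(1:24, each = 100)),
  price = runif(2400, min = -10, max = 125),
  wind  = runif(2400, min = 0, max = 2500),
  temp  = runif(2400, min = -10, max = 25)
)
df.f <- data.frame(
  hour = factor(rep(1:24, each = 10)),
  wind = runif(240, min = 0, max = 2500),
  temp = runif(240, min = -10, max = 25)
)

regs <- c("wind", "temp")  # hypothetical helper: columns used as regressors

# Step 1: one ARIMA(1,1,0) fit per hour, kept in a list named by hour
fits <- lapply(split(df.h, df.h$hour), function(d) {
  Arima(d$price, xreg = as.matrix(d[, regs]), order = c(1, 1, 0))
})

# Step 2: forecast each hour with its own fit and its own regressors
fore.by.hour <- split(df.f, df.f$hour)
res2 <- do.call(rbind, Map(function(fit, d) {
  d$price <- as.numeric(forecast(fit, xreg = as.matrix(d[, regs]))$mean)
  d
}, fits, fore.by.hour[names(fits)]))

head(res2)
```

Keeping the fitted models in fits means they can be inspected or reused for later forecasts without refitting, which the single do call does not allow.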