Zac*_*ach 11 finance r quantmod
我正在运行一个非常类似于以下代码的滚动回归:
library(PerformanceAnalytics)
library(quantmod)
data(managers)
FL <- as.formula(Next(HAM1)~HAM1+HAM2+HAM3+HAM4)
MyRegression <- function(df,FL) {
df <- as.data.frame(df)
model <- lm(FL,data=df[1:30,])
predict(model,newdata=df[31,])
}
system.time(Result <- rollapply(managers, 31, FUN="MyRegression",FL,
by.column = FALSE, align = "right", na.pad = TRUE))
Run Code Online (Sandbox Code Playgroud)
我有一些额外的处理器,所以我试图找到一种方法来并行化滚动窗口.如果这是一个非滚动回归,我可以使用apply系列函数轻松地并行化它...
显而易见的是使用lm.fit()而不是lm()因为你不会在处理公式等时产生所有开销.
更新:所以,当我明确说明我想说的是明显的,但看似难以实现!
经过一番摆弄后,我想出了这个
library(PerformanceAnalytics)
library(quantmod)
data(managers)
Run Code Online (Sandbox Code Playgroud)
第一阶段是要意识到模型矩阵可以预先构建,所以我们这样做并将其转换回Zoo对象以用于rollapply():
mmat2 <- model.frame(Next(HAM1) ~ HAM1 + HAM2 + HAM3 + HAM4, data = managers,
na.action = na.pass)
mmat2 <- cbind.data.frame(mmat2[,1], Intercept = 1, mmat2[,-1])
mmatZ <- as.zoo(mmat2)
Run Code Online (Sandbox Code Playgroud)
现在我们需要一个函数lm.fit()来完成繁重的工作,而不必在每次迭代时创建设计矩阵:
MyRegression2 <- function(Z) {
## store value we want to predict for
pred <- Z[31, -1, drop = FALSE]
## get rid of any rows with NA in training data
Z <- Z[1:30, ][!rowSums(is.na(Z[1:30,])) > 0, ]
## Next() would lag and leave NA in row 30 for response
## but we precomputed model matrix, so drop last row still in Z
Z <- Z[-nrow(Z),]
## fit the model
fit <- lm.fit(Z[, -1, drop = FALSE], Z[,1])
## get things we need to predict, in case pivoting turned on in lm.fit
p <- fit$rank
p1 <- seq_len(p)
piv <- fit$qr$pivot[p1]
## model coefficients
beta <- fit$coefficients
## this gives the predicted value for row 31 of data passed in
drop(pred[, piv, drop = FALSE] %*% beta[piv])
}
Run Code Online (Sandbox Code Playgroud)
时间比较:
> system.time(Result <- rollapply(managers, 31, FUN="MyRegression",FL,
+ by.column = FALSE, align = "right",
+ na.pad = TRUE))
user system elapsed
0.925 0.002 1.020
>
> system.time(Result2 <- rollapply(mmatZ, 31, FUN = MyRegression2,
+ by.column = FALSE, align = "right",
+ na.pad = TRUE))
user system elapsed
0.048 0.000 0.05
Run Code Online (Sandbox Code Playgroud)
这比原版提供了相当合理的改进.现在检查生成的对象是否相同:
> all.equal(Result, Result2)
[1] TRUE
Run Code Online (Sandbox Code Playgroud)
请享用!