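I need to carry out an in-depth study of the main Portmanteau tests. To do so, I have to evaluate them under different scenarios, sample sizes and ARMA(p,q) models, which yields 180 scenarios and takes me 6 hours. I programmed my functions in R and with Rcpp, and to my surprise the C++ version is slower. My question is: why?
My R code: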
Portmanteau <- function(x, h = 1, type = c("Box-Pierce", "Ljung-Box", "Monti"), fitdf = 0) {
  type <- match.arg(type)  # resolve the default vector to a single test name
  Ti <- length(x)
  df <- h - fitdf
  ri <- acf(x, lag.max = h, plot = FALSE, na.action = na.pass)
  pi <- pacf(x, lag.max = h, plot = FALSE, na.action = na.pass)
  # d = 1 uses the autocorrelations, d = 0 the partial autocorrelations (Monti)
  if (type == "Monti") { d <- 0 } else { d <- 1 }
  # Box-Pierce uses unit weights; Ljung-Box and Monti use (Ti + 2) / (Ti - k)
  if (type == "Box-Pierce") { wi <- 1 } else { wi <- (Ti + 2) / seq(Ti - 1, Ti - h) }
  Q <- Ti * (d * sum(wi * ri$acf[-1]^2) + (1 - d) * sum(wi * pi$acf^2))
  pv <- pchisq(Q, df, lower.tail = FALSE)
  result <- cbind(Statistic = Q, df, p.value = pv)
  rownames(result) <- paste(type, "test")
  return(result)
}
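For reference, and assuming the standard definitions of these tests, the function above computes the following statistics, where $n$ is the series length (`Ti`), $\hat r_k$ is the lag-$k$ sample autocorrelation and $\hat\pi_k$ the lag-$k$ sample partial autocorrelation:

$$
Q_{\mathrm{BP}} = n \sum_{k=1}^{h} \hat r_k^{2}, \qquad
Q_{\mathrm{LB}} = n (n+2) \sum_{k=1}^{h} \frac{\hat r_k^{2}}{n-k}, \qquad
Q_{\mathrm{M}} = n (n+2) \sum_{k=1}^{h} \frac{\hat\pi_k^{2}}{n-k}
$$

Each statistic is compared against a $\chi^2$ distribution with $h - \mathrm{fitdf}$ degrees of freedom.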
My Rcpp code:
#include <Rcpp.h>
#include <cstring>  // strcmp
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector PortmanteauC(NumericVector x, int h = 1, const char* type = "Box-Pierce", int fitdf = 0) {
  // Look up the R functions; every call below goes back into the R interpreter
  Environment stats("package:stats");
  Function acf = stats["acf"];
  Function pacf = stats["pacf"];
  Function na_pass = stats["na.pass"];
  // Arguments are matched by position: (x, lag.max, type, plot, na.action)
  List ri = acf(x, h, "correlation", false, na_pass);
  // Arguments are matched by position: (x, lag.max, plot, na.action)
  List pi = pacf(x, h, false, na_pass);
  int Ti = x.size();
  int df = h - fitdf;
  double d;
  NumericVector wi;
  NumericVector rk = ri["acf"];
  NumericVector pk = pi["acf"];
  NumericVector S(h);
  for (int i = 0; i < h; ++i) { S[i] = Ti - i - 1; }
  rk.erase(0);  // drop the lag-0 autocorrelation
  if (strcmp(type, "Monti") == 0) { d = 0; } else { d = 1; }
  if (strcmp(type, "Box-Pierce") == 0) { wi = rep(1, h); } else { wi = (Ti + 2) / S; }
  double Q = Ti * (d * sum(wi * pow(rk, 2)) + (1 - d) * sum(wi * pow(pk, 2)));
  double pv = R::pchisq(Q, df, 0, false);  // upper tail, not on the log scale
  NumericVector result(3);
  result[0] = Q;
  result[1] = df;
  result[2] = pv;
  return result;
}
Example
set.seed(1)
y = arima.sim(model = list(ar = 0.5), n = 250)
mod = arima(y, order = c(1,0,0))
res = mod$residuals
Box-Pierce
library(rbenchmark)
benchmark(PortmanteauC(res, h=10, type = "Box-Pierce",fitdf = 1),replications = 500,Portmanteau(res,h = 10, type = "Box-Pierce", fitdf= 1),
Box.test(res, lag = 10, type = "Box-Pierce", fitdf= 1))[,1:4]
  test                                                       replications  elapsed  relative
3 Box.test(res, lag = 10, type = "Box-Pierce", fitdf = 1)             500     0.17     1.000
2 Portmanteau(res, h = 10, type = "Box-Pierce", fitdf = 1)            500     0.44     2.588
1 PortmanteauC(res, h = 10, type = "Box-Pierce", fitdf = 1)           500     1.82    10.706
Ljung-Box
benchmark(Box.test(res, lag = 5, type = "Ljung-Box", fitdf= 1),replications = 500,
Portmanteau(res,h = 5, type = "Ljung-Box", fitdf= 1),
PortmanteauC(res,h = 5, type = "Ljung-Box", fitdf= 1))[,1:4]
  test                                                      replications  elapsed  relative
1 Box.test(res, lag = 5, type = "Ljung-Box", fitdf = 1)              500     0.17     1.000
2 Portmanteau(res, h = 5, type = "Ljung-Box", fitdf = 1)             500     0.45     2.647
3 PortmanteauC(res, h = 5, type = "Ljung-Box", fitdf = 1)            500     1.84    10.824
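I expected Rcpp to be much faster than byte-compiled R.
Let us analyze the performance characteristics of the R code. A single call is so fast that the sampling profiler provided by R cannot easily be used on it, so I simply wrap the code in repeat() and let it run until interrupted: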
Portmanteau <- function(x, h = 1, type = c("Box-Pierce", "Ljung-Box", "Monti"), fitdf = 0) {
  type <- match.arg(type)  # resolve the default vector to a single test name
  Ti <- length(x)
  df <- h - fitdf
  ri <- acf(x, lag.max = h, plot = FALSE, na.action = na.pass)
  pi <- pacf(x, lag.max = h, plot = FALSE, na.action = na.pass)
  # d = 1 uses the autocorrelations, d = 0 the partial autocorrelations (Monti)
  if (type == "Monti") { d <- 0 } else { d <- 1 }
  # Box-Pierce uses unit weights; Ljung-Box and Monti use (Ti + 2) / (Ti - k)
  if (type == "Box-Pierce") { wi <- 1 } else { wi <- (Ti + 2) / seq(Ti - 1, Ti - h) }
  Q <- Ti * (d * sum(wi * ri$acf[-1]^2) + (1 - d) * sum(wi * pi$acf^2))
  pv <- pchisq(Q, df, lower.tail = FALSE)
  result <- cbind(Statistic = Q, df, p.value = pv)
  rownames(result) <- paste(type, "test")
  return(result)
}
set.seed(1)
profvis::profvis({
  repeat({
    y = arima.sim(model = list(ar = 0.5), n = 250)
    mod = arima(y, order = c(1,0,0))
    res = mod$residuals
    Portmanteau(res, h = 10, type = "Box-Pierce", fitdf = 1)
  })
})
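I let it run for about 49 seconds. Part of the graphical output provided by RStudio can be seen here: [screenshot of the profvis output]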
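We learn from this:
arima() takes about seven times as long as Portmanteau(). Depending on the ratio of calls between these two functions, you may be optimizing the wrong function.
Within the Portmanteau() call, almost all of the time is spent in pacf() and acf(). Your Rcpp code uses these same R functions, but with the added overhead of calling back from C++ into R. This explains why your C++ code is slower than the R code.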
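One way to act on the observation about acf() and pacf(), offered as a sketch rather than as the method used above, is to compute the autocorrelations in C++ itself so that no call goes back into R. The function names `sample_acf` and `portmanteau_q` are hypothetical, and the code uses plain `std::vector` so that it compiles without Rcpp. It assumes the usual biased estimator (mean removed, every lag divided by n), which is what stats::acf() is understood to use by default. It does not handle missing values, does not cover the Monti variant (which needs partial autocorrelations), and leaves the p-value to something like R::pchisq().

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sample autocorrelations r_1..r_h of x (mean removed, biased estimator).
// Assumption: this matches the defaults of stats::acf(); NA handling is
// not reproduced here.
std::vector<double> sample_acf(const std::vector<double>& x, std::size_t h) {
    const std::size_t n = x.size();
    assert(n > 1 && h < n);

    double mean = 0.0;
    for (double v : x) mean += v;
    mean /= static_cast<double>(n);

    // Lag-0 sum of squares; the 1/n factor cancels in the ratio below.
    double c0 = 0.0;
    for (double v : x) c0 += (v - mean) * (v - mean);
    assert(c0 > 0.0);  // a constant series has no defined autocorrelation

    std::vector<double> r(h);
    for (std::size_t k = 1; k <= h; ++k) {
        double ck = 0.0;
        for (std::size_t t = k; t < n; ++t)
            ck += (x[t] - mean) * (x[t - k] - mean);
        r[k - 1] = ck / c0;
    }
    return r;
}

// Box-Pierce (ljung_box = false) or Ljung-Box (ljung_box = true) statistic.
double portmanteau_q(const std::vector<double>& x, std::size_t h, bool ljung_box) {
    const double n = static_cast<double>(x.size());
    const std::vector<double> r = sample_acf(x, h);
    double q = 0.0;
    for (std::size_t k = 1; k <= h; ++k) {
        // Ljung-Box weights each squared autocorrelation by (n + 2) / (n - k).
        const double w = ljung_box ? (n + 2.0) / (n - static_cast<double>(k)) : 1.0;
        q += w * r[k - 1] * r[k - 1];
    }
    return n * q;
}
```

Whether this pays off depends on the first observation: if arima() really takes about seven times as long as Portmanteau(), the test accounts for roughly one eighth of the combined time of the two, so even a free Portmanteau() would shorten that part of the loop by only about 12%.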