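Suppose I have x = rnorm(100000). Instead of a moving average over a sliding window of length 1000, I would like a sliding window of length 1000 that counts how many times x is above 0.2 (the code below reports this as a fraction of the window).
For example,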
x <- rnorm(1004)
start <- 1:1000
record <- list()
while(start[length(start)] <= length(x)) {
record[[length(record) + 1]] <- length(which(x[start] > 0.2))/length(start)
start <- start + 1
print(record[[length(record)]]);flush.console()
}
This is inefficient for large length(x). What is an efficient way to do it?
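My contribution is to take the lagged difference of the cumulative sum of the condition (the examples here use a window of 20 rather than 1000):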
cumdiff = function(x) diff(c(0, cumsum( x > .2)), 20)
along with
filt = function(x) filter(x > 0.2, rep(1, 20), sides=1)
library(TTR); ttr = function(x) runSum(x > .2, 20)
cumsub = function(x) { z <- cumsum(c(0, x>0.2)); tail(z,-20) - head(z,-20) }
all of which perform reasonably well:
> library(microbenchmark)
> set.seed(123); xx = rnorm(100000)
> microbenchmark(cumdiff(xx), filt(xx), ttr(xx), cumsub(xx))
Unit: milliseconds
        expr       min        lq    median       uq      max neval
 cumdiff(xx) 11.192005 12.387862 12.469253 12.77588 13.72404   100
    filt(xx) 20.979503 22.058045 22.442765 23.02612 62.91730   100
     ttr(xx)  8.390923 10.023934 10.119772 10.46309 11.04173   100
  cumsub(xx)  7.015654  8.483432  8.538171  8.73596  9.65421   100
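To see why the cumulative-sum approach gives the same answer as the loop in the question, here is a minimal sketch that generalizes cumsub to an arbitrary window length and divides by the window size to get a fraction. The names roll_frac_above and naive_frac_above are hypothetical helpers introduced for illustration; they are not part of the original post or of any package.

```r
# Hypothetical helper (not from the original post): fraction of values above
# `threshold` in every full window of length `k`, via one cumulative sum.
roll_frac_above <- function(x, k, threshold = 0.2) {
  stopifnot(k >= 1, k <= length(x))
  z <- cumsum(c(0, x > threshold))   # z[i + 1] = number of hits in x[1..i]
  (tail(z, -k) - head(z, -k)) / k    # hits in x[i..(i + k - 1)], divided by k
}

# Naive reference: recount every window from scratch, as the question's loop does.
naive_frac_above <- function(x, k, threshold = 0.2) {
  n_windows <- length(x) - k + 1
  vapply(seq_len(n_windows),
         function(i) mean(x[i:(i + k - 1)] > threshold),
         numeric(1))
}

set.seed(1)
x <- rnorm(1004)
fast <- roll_frac_above(x, 1000)
slow <- naive_frac_above(x, 1000)
length(fast)            # 5 windows, one per iteration of the question's loop
all.equal(fast, slow)   # TRUE
```

The naive version does work proportional to length(x) times the window length, whereas the cumulative-sum version makes a single pass over x, which is where the speed-up comes from.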
These differ in the details of how the result is represented (e.g., filt and ttr return leading NAs), and only filter copes with embedded NAs, returning NA just for the windows that contain one:
> xx[22] = NA
> head(cumdiff(xx)) # NA's propagate, silently
[1] 9 9 NA NA NA NA
> ttr(xx)
Error in runSum(x > 0.2, 20) : Series contains non-leading NAs
> tail(filt(xx), -19)
[1] 9 9 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 8 8 9
...
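If embedded NAs should simply be skipped instead of propagating, one possible variation on the cumulative-sum idea is to count hits and non-missing values separately. This is only a sketch under that assumption: roll_count_above_na is a hypothetical helper, and treating an NA as "not a hit" is a design choice that the original answer does not make.

```r
# Hypothetical helper (not from the original post): per-window count of values
# above `threshold`, ignoring NAs, plus the number of non-missing values.
roll_count_above_na <- function(x, k, threshold = 0.2) {
  stopifnot(k >= 1, k <= length(x))
  valid <- !is.na(x)
  hit   <- valid & x > threshold     # FALSE & NA is FALSE, so NAs never count
  window_sum <- function(v) {
    z <- cumsum(c(0, v))             # running total with a leading zero
    tail(z, -k) - head(z, -k)        # total over each window of length k
  }
  data.frame(count = window_sum(hit), n_valid = window_sum(valid))
}

x <- c(0.5, NA, 0.3, 0.1, 0.9)
res <- roll_count_above_na(x, 3)
res$count     # 2 1 2
res$n_valid   # 2 2 3
```

Dividing count by n_valid would give the fraction among the observed values in each window; whether that is appropriate depends on why the data are missing.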