Mat*_*ell 195
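Alternatively, you can compute it simply with filter. If dplyr is loaded, call stats::filter explicitly, because dplyr masks filter with its own function.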
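A minimal sketch of the filter approach, assuming equal weights and a centered window. The function name ma, the default n = 5 and the sample vector are illustrative and not taken from the answer.

```r
# Centered moving average via a linear convolution filter (base R, stats package).
# `ma` and the default window n = 5 are illustrative choices.
ma <- function(x, n = 5) {
  # equal weights 1/n; sides = 2 centers the window on each point
  stats::filter(x, rep(1 / n, n), sides = 2)
}

x <- c(2, 4, 6, 8, 10, 12)
ma_out <- as.numeric(ma(x, n = 3))
ma_out  # first and last positions are NA because the window is incomplete there
```

Writing stats::filter explicitly keeps the function working whether or not dplyr is attached.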
pip*_*ish 26
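Using cumsum should be sufficient and efficient. Suppose you have a vector x and you want a running mean over n values (the result is named rsum below, but the division by n makes it a mean):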
cx <- c(0,cumsum(x))
rsum <- (cx[(n+1):length(cx)] - cx[1:(length(cx) - n)]) / n
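As pointed out in the comments by @mzuther, this assumes there are no NAs in the data. Handling them requires dividing each window by its number of non-NA values. Here is one way to do that, incorporating the comment from @Ricardo Cruz: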
cx <- c(0, cumsum(ifelse(is.na(x), 0, x)))
cn <- c(0, cumsum(ifelse(is.na(x), 0, 1)))
rx <- cx[(n+1):length(cx)] - cx[1:(length(cx) - n)]
rn <- cn[(n+1):length(cx)] - cn[1:(length(cx) - n)]
rsum <- rx / rn
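This still has one problem: if all values in a window are NA, the division is 0/0. R does not raise an error for that; the result for that window is NaN.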
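One way to guard the all-NA case is to return NA whenever a window has no non-NA values. The sketch below wraps the cumsum approach in a function; the name running_mean_na and the sample data are assumptions made for illustration.

```r
# NA-aware running mean built on cumulative sums (base R only).
# `running_mean_na` is a hypothetical helper name.
running_mean_na <- function(x, n) {
  stopifnot(n >= 1, n <= length(x))
  cx <- c(0, cumsum(ifelse(is.na(x), 0, x)))  # cumulative sum, NAs counted as 0
  cn <- c(0, cumsum(ifelse(is.na(x), 0, 1)))  # cumulative count of non-NA values
  hi <- (n + 1):length(cx)
  lo <- 1:(length(cx) - n)
  rx <- cx[hi] - cx[lo]  # per-window sum of non-NA values
  rn <- cn[hi] - cn[lo]  # per-window count of non-NA values
  # windows with no non-NA values yield NA instead of 0/0
  ifelse(rn > 0, rx / rn, NA_real_)
}

x <- c(1, 2, NA, NA, 5, 6)
rm_out <- running_mean_na(x, 2)
rm_out
```

The result has length(x) - n + 1 elements, one per complete window.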
jan*_*cki 15
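data.table 1.12.0 added a new frollmean function that computes fast and exact rolling means, carefully handling NA, NaN, +Inf and -Inf values.
Since the question contains no reproducible example, there is not much more to address here.
You can find more information in the ?frollmean manual page, which is also available online.
Examples from the manual follow: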
library(data.table)
d = as.data.table(list(1:6/2, 3:8/4))
# rollmean of single vector and single window
frollmean(d[, V1], 3)
# multiple columns at once
frollmean(d, 3)
# multiple windows at once
frollmean(d[, .(V1)], c(3, 4))
# multiple columns and multiple windows at once
frollmean(d, c(3, 4))
## three above are embarrassingly parallel using openmp
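You can use RcppRoll for very fast moving averages written in C++. Just call the roll_mean function. The documentation can be found here.
Otherwise, this (slower) for loop should do the trick: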
ma <- function(arr, n=15){
  res = arr
  for(i in n:length(arr)){
    res[i] = mean(arr[(i-n):i])
  }
  res
}
RcppRoll is indeed very good.
The code posted by cantdutchthis must be corrected in its fourth line so that the window has a fixed length of n:
ma <- function(arr, n=15){
  res = arr
  for(i in n:length(arr)){
    res[i] = mean(arr[(i-n+1):i])
  }
  res
}
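To see why the correction matters, it helps to count the elements each index range selects. This is a small check with illustrative values for n and i, not part of either answer.

```r
# Compare the window produced by (i-n):i with the corrected (i-n+1):i.
n <- 3  # illustrative window length
i <- 5  # illustrative position

len_orig  <- length((i - n):i)      # range 2:5
len_fixed <- length((i - n + 1):i)  # range 3:5

len_orig   # one element more than n
len_fixed  # exactly n
```

The uncorrected range covers n + 1 positions, so each average would mix in one extra older value.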
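Another approach, which handles missing values, is given here.
A third approach improves on cantdutchthis's code by letting you choose whether or not to compute partial averages: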
ma <- function(x, n = 2, parcial = TRUE) {
  res <- x  # start from the input so the result has the same length
  for (i in seq_along(x)) {
    t <- max(i - n + 1, 1)  # window start, clamped to the first element
    res[i] <- mean(x[t:i])
  }
  if (!parcial && n > 1) {
    # drop the first n-1 values, which are averages over partial windows
    res <- res[-seq_len(n - 1)]
  }
  res
}
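The following example code shows how to compute a centered moving average and a trailing moving average using the rollmean function from the zoo package.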
library(tidyverse)
library(zoo)
some_data = tibble(day = 1:10)
# cma = centered moving average
# tma = trailing moving average
some_data = some_data %>%
  mutate(cma = rollmean(day, k = 3, fill = NA)) %>%
  mutate(tma = rollmean(day, k = 3, fill = NA, align = "right"))
some_data
#> # A tibble: 10 x 3
#>      day   cma   tma
#>    <int> <dbl> <dbl>
#>  1     1    NA    NA
#>  2     2     2    NA
#>  3     3     3     2
#>  4     4     4     3
#>  5     5     5     4
#>  6     6     6     5
#>  7     7     7     6
#>  8     8     8     7
#>  9     9     9     8
#> 10    10    NA     9
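For readers without zoo installed, the same centered and trailing averages can be reproduced in base R. The sketch below swaps in stats::filter, which is a different technique from rollmean; the variable names are illustrative.

```r
# Centered vs. trailing 3-point moving average with base R's stats::filter.
day <- 1:10
w <- rep(1 / 3, 3)  # equal weights for a window of 3

cma <- as.numeric(stats::filter(day, w, sides = 2))  # centered: uses day[i-1], day[i], day[i+1]
tma <- as.numeric(stats::filter(day, w, sides = 1))  # trailing: uses day[i-2], day[i-1], day[i]

data.frame(day = day, cma = cma, tma = tma)
```

The centered version leaves NA at both ends, while the trailing version leaves NA only in the first two rows.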
To complement the answers of cantdutchthis and Rodrigo Remedio:
moving_fun <- function(x, w, FUN, ...) {
  # x: a double vector
  # w: the length of the window, i.e., the section of the vector selected to apply FUN
  # FUN: a function that takes a vector and returns a summary value, e.g., mean, sum, etc.
  # ...: further arguments passed on to FUN, e.g., na.rm = TRUE
  # Given a double vector, apply FUN over a moving window from left to right.
  # When a window is not a legal section, i.e., lower_bound and i (the upper bound)
  # are not both within the vector, return NA_real_ for that position.
  if (w < 1) {
    stop("The length of the window 'w' must be greater than 0")
  }
  output <- x
  for (i in seq_along(x)) {
    # plus 1 because the index is inclusive with the upper bound 'i'
    lower_bound <- i - w + 1
    if (lower_bound < 1) {
      output[i] <- NA_real_
    } else {
      # the extra arguments belong to FUN, not inside the subscript
      output[i] <- FUN(x[lower_bound:i], ...)
    }
  }
  output
}
# example
v <- 1:10
# compute a MA(2)
moving_fun(v, 2, mean)
# compute moving sum of two periods
moving_fun(v, 2, sum)
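The same idea can be written more compactly with vapply. This is only a sketch under the same trailing-window assumption; roll_apply is a hypothetical name, and the example passes na.rm = TRUE through to mean to show how extra arguments reach FUN.

```r
# Trailing moving-window apply using vapply (base R only).
# `roll_apply` is a hypothetical helper, not part of any package.
roll_apply <- function(x, w, FUN, ...) {
  stopifnot(w >= 1)
  vapply(seq_along(x), function(i) {
    # positions without a full window of w values get NA
    if (i < w) NA_real_ else FUN(x[(i - w + 1):i], ...)
  }, numeric(1))
}

x <- c(1, NA, 3, 4)
ra_out <- roll_apply(x, 2, mean, na.rm = TRUE)
ra_out
```

vapply checks that every window returns a single number, which catches a FUN that returns a vector by mistake.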