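I want to find the positions in a vector where the value differs by more than some threshold from an earlier point in the vector. The first change point should be measured relative to the first value in the vector. Each subsequent change point should be measured relative to the previous change point.
I can do this with a for loop, but I am wondering whether there is a more idiomatic and faster vectorized solution.
Minimal example: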
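set.seed(123)
x = cumsum(rnorm(500))   # random walk to search for change points
mindiff = 5.0            # threshold for declaring a change point
start = x[1]             # reference value: first element, then the latest change point
changepoints = integer()
for (i in 1:length(x)) {
  if (abs(x[i] - start) > mindiff) {
    changepoints = c(changepoints, i)
    start = x[i]
  }
}
# Plot the series and mark the detected change points in red
plot(x, type = 'l')
points(changepoints, x[changepoints], col='red')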
Implementing the same code in Rcpp can help with speed.
library(Rcpp)
cppFunction(
  "IntegerVector foo(NumericVector vect, double difference){
    int start = 0;
    IntegerVector changepoints;
    for (int i = 0; i < vect.size(); i++){
      if((vect[i] - vect[start]) > difference || (vect[start] - vect[i]) > difference){
        changepoints.push_back (i+1);
        start = i;
      }
    }
    return(changepoints);
  }"
)
foo(vect = x, difference = mindiff)
# [1] 17 25 56 98 108 144 288 297 307 312 403 470 487
identical(foo(vect = x, difference = mindiff), changepoints)
#[1] TRUE
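One caveat, offered as an assumption rather than something measured here: `push_back` on an Rcpp `IntegerVector` is generally understood to copy the whole vector on each call, so collecting the hits in a `std::vector<int>` first may scale better when there are many change points. A minimal standalone sketch of the same loop in plain C++ follows; the name `change_points` is hypothetical and not part of the code above.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original answer): returns the 1-based
// positions where the value differs from the current reference by more than
// `difference`. The reference starts at the first element and moves to each
// detected change point.
std::vector<int> change_points(const std::vector<double>& vect, double difference) {
    std::vector<int> changepoints;
    if (vect.empty()) return changepoints;

    double start = vect[0];
    for (std::size_t i = 0; i < vect.size(); ++i) {
        if (std::fabs(vect[i] - start) > difference) {
            changepoints.push_back(static_cast<int>(i) + 1);  // 1-based, as in R
            start = vect[i];
        }
    }
    return changepoints;
}
```

Inside `cppFunction`, the same body could take a `NumericVector` and return `Rcpp::wrap(changepoints)`. That variant is not benchmarked here, so treat any speed gain as unverified.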
Benchmarking
#DATA
set.seed(123)
x = cumsum(rnorm(1e5))
mindiff = 5.0
library(microbenchmark)
microbenchmark(
  baseR = {
    start = x[1]
    changepoints = integer()
    for (i in 1:length(x)) {
      if (abs(x[i] - start) > mindiff) {
        changepoints = c(changepoints, i)
        start = x[i]
      }
    }
  },
  Rcpp = foo(vect = x, difference = mindiff)
)
#Unit: milliseconds
#  expr        min        lq      mean    median        uq      max neval cld
# baseR 117.194668 123.07353 125.98741 125.56882 127.78463 139.5318   100   b
#  Rcpp   7.907011  11.93539  14.47328  12.16848  12.38791 263.2796   100  a