R: Efficiently locating the time series segments with maximal cross-correlation to an input segment?

Question


Mik*_*der 8 r time-series subset correlation

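I have a long numeric time series of roughly 200,000 rows (let's call it Z).

In a loop, I subset x (about 30) consecutive rows from Z at a time and treat them as the query segment q.

Within Z, I want to locate the y (about 300) segments of length x that are most highly correlated with q.

What is an efficient way to do this?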

Answer 1

Jos*_*ien 5

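The code below finds the 300 segments you are looking for, and it runs in about 8 seconds on my Windows laptop, so it should be fast enough for your purposes.

First, it constructs a 30-by-199971 matrix (Zmat) whose columns contain all of the length-30 "time series segments" you want to examine. A single call to cor(), operating on the vector q and the matrix Zmat, then calculates all of the desired correlation coefficients. Finally, the resulting vector is examined to identify the 300 segments with the highest correlation coefficients.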
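To make the layout of Zmat concrete, here is a toy sketch of the same sliding-window construction on a six-element series with a window of length 3. The names Z_toy, w, and Zmat_toy are illustrative only and do not appear in the answer's code.

```r
# Toy illustration (hypothetical names): a 6-point series, window length 3
Z_toy <- 1:6
w <- 3

# Same construction as Zmat: column i holds observations i:(i + w - 1)
Zmat_toy <- sapply(seq_len(length(Z_toy) - w + 1),
                   FUN = function(i) Z_toy[seq(from = i, length.out = w)])

dim(Zmat_toy)   # 3 rows (window length) by 4 columns (number of windows)
Zmat_toy
#      [,1] [,2] [,3] [,4]
# [1,]    1    2    3    4
# [2,]    2    3    4    5
# [3,]    3    4    5    6
```

A series of length n with window length w yields n - w + 1 columns, which is where 199971 comes from for n = 200000 and w = 30.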

# Simulate data
nZ <- 200000
nq <- 30
Z <- rnorm(nZ)
q <- seq_len(nq)

# From Z, construct a 30 by 199971 matrix, in which each column is a
# "time series segment". Column 1 contains observations 1:30, column 2
# contains observations 2:31, and so on through the end of the series.
Zmat <- sapply(seq_len(nZ - nq + 1),  
               FUN = function(X) Z[seq(from = X, length.out = nq)])

# Calculate the correlation of q with every column ("time series segment").
Cors <- cor(q, Zmat)

# Extract the starting position of the 300 most highly correlated segments    
ids <- order(Cors, decreasing=TRUE)[1:300]

# Maybe try something like the following to confirm that you have
# selected the most highly correlated segments.
hist(Cors, breaks=100)
hist(Cors[ids], col="red", add=TRUE)
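
Since the question runs this search inside a loop over many query segments, it may help to wrap the steps above in a function. The following is a minimal sketch, not part of the original answer: the helper name top_correlated_segments and its n_top argument are hypothetical, and it swaps the sapply() construction for base R's embed(), which builds the same windows (one per row, in reverse time order, hence the column flip).

```r
# Hypothetical helper (not from the original answer): returns the start
# positions and correlations of the n_top windows of Z most correlated with q.
top_correlated_segments <- function(Z, q, n_top = 300) {
  nq <- length(q)
  stopifnot(nq >= 2, length(Z) >= nq)

  # embed() returns one window per row, each in reverse time order,
  # so flip the columns to restore chronological order.
  windows <- embed(Z, nq)[, nq:1, drop = FALSE]

  # cor() correlates q with each column, so pass the windows as columns.
  cors <- as.vector(cor(q, t(windows)))

  # Keep the n_top highest correlations (fewer if Z has fewer windows).
  n_top <- min(n_top, length(cors))
  ids <- order(cors, decreasing = TRUE)[seq_len(n_top)]
  data.frame(start = ids, cor = cors[ids])
}

# Usage on simulated data, with a query taken from Z itself
set.seed(42)
Z <- rnorm(5000)
q <- Z[1001:1030]
top <- top_correlated_segments(Z, q, n_top = 5)
top
```

When q is itself a slice of Z, as in the question, the window at q's own position has a correlation of 1 with q and therefore ranks first. Depending on the goal, you may want to drop that window, and possibly windows overlapping it, before taking the top y.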

