Rdo*_*dou 10 python numpy scipy
我想计算3个阵列X1,X2和Y的自协方差,它们都是固定随机过程.sciPy或其他库中是否有任何功能可以解决这个问题?
根据离散信号的自协方差系数的标准估计,可用公式表示:
... x(i)
给定信号(即特定的1D向量)在哪里,k
代表x(i)
信号按k
样本移位,N
是x(i)
信号的长度,以及:
...这是简单的平均值,我们可以写:
'''
Calculate the autocovarriance coefficient.
'''
import numpy as np
Xi = np.array([1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5])
N = np.size(Xi)
k = 5
Xs = np.average(Xi)
def autocovariance(Xi, N, k, Xs):
autoCov = 0
for i in np.arange(0, N-k):
autoCov += ((Xi[i+k])-Xs)*(Xi[i]-Xs)
return (1/(N-1))*autoCov
print("Autocovariance:", autocovariance(Xi, N, k, Xs))
Run Code Online (Sandbox Code Playgroud)
如果你想归一化自协方差系数,它将成为自相关系数,表示为:
...比你只需要在上面的代码中添加两行:
def autocorrelation():
return autocovariance(Xi, N, k, Xs) / autocovariance(Xi, N, 0, Xs)
Run Code Online (Sandbox Code Playgroud)
这是完整的脚本:
'''
Calculate the autocovarriance and autocorrelation coefficients.
'''
import numpy as np
Xi = np.array([1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5])
N = np.size(Xi)
k = 5
Xs = np.average(Xi)
def autocovariance(Xi, N, k, Xs):
autoCov = 0
for i in np.arange(0, N-k):
autoCov += ((Xi[i+k])-Xs)*(Xi[i]-Xs)
return (1/(N-1))*autoCov
def autocorrelation():
return autocovariance(Xi, N, k, Xs) / autocovariance(Xi, N, 0, Xs)
print("Autocovariance:", autocovariance(Xi, N, k, Xs))
print("Autocorrelation:", autocorrelation())
Run Code Online (Sandbox Code Playgroud)
对之前的答案进行了一个小调整,避免了 pythonfor
循环并使用 numpy 数组操作。如果您有大量数据,这会更快。
def lagged_auto_cov(Xi,t):
"""
for series of values x_i, length N, compute empirical auto-cov with lag t
defined: 1/(N-1) * \sum_{i=0}^{N-t} ( x_i - x_s ) * ( x_{i+t} - x_s )
"""
N = len(Xi)
# use sample mean estimate from whole series
Xs = np.mean(Xi)
# construct copies of series shifted relative to each other,
# with mean subtracted from values
end_padded_series = np.zeros(N+t)
end_padded_series[:N] = Xi - Xs
start_padded_series = np.zeros(N+t)
start_padded_series[t:] = Xi - Xs
auto_cov = 1./(N-1) * np.sum( start_padded_series*end_padded_series )
return auto_cov
Run Code Online (Sandbox Code Playgroud)
将此与@bluevoxel的代码进行比较,使用 50,000 个数据点的时间序列并计算单个固定滞后值的自相关,pythonfor
循环代码平均约为 30 毫秒,使用 numpy 数组平均速度快于 0.3毫秒(在我的笔记本电脑上运行)。
归档时间: |
|
查看次数: |
7145 次 |
最近记录: |