Smoothing time series data in Python

Kyl*_*ndt 13 python time-series

I have some data in Python as (unixtime, value) pairs:

[(1301672429, 274), (1301672430, 302), (1301672431, 288)...]

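The time steps forward by exactly one second per sample. How can I reduce this data so that there is still one timestamp per second, but each value is the average of the surrounding 10 values?

Fancier rolling averages would be good too, but this data is graphed, so the main goal is to smooth out the graph.

(Follow-up to "TSQL Rolling Average of Time Groupings", after coming to the conclusion that trying to do this in SQL is a path of pain.)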

Kyl*_*ndt 16

Using http://www.scipy.org/Cookbook/SignalSmooth:

import numpy

def smooth(x, window_len=11, window='hanning'):
    """Smooth a 1-D array by convolving it with a normalized window."""
    if x.ndim != 1:
        raise ValueError("smooth only accepts 1 dimension arrays.")
    if x.size < window_len:
        raise ValueError("Input vector needs to be bigger than window size.")
    if window_len < 3:
        return x
    if window not in ['flat', 'hanning', 'hamming', 'bartlett', 'blackman']:
        raise ValueError("Window is one of 'flat', 'hanning', 'hamming', 'bartlett', 'blackman'")
    # Pad both ends with the signal reflected about its end points to reduce edge effects
    s = numpy.r_[2*x[0]-x[window_len-1::-1], x, 2*x[-1]-x[-1:-window_len:-1]]
    if window == 'flat':  # moving average
        w = numpy.ones(window_len, 'd')
    else:
        # Look up the window function by name (numpy.hanning, numpy.hamming, ...)
        w = getattr(numpy, window)(window_len)
    y = numpy.convolve(w/w.sum(), s, mode='same')
    # Trim the padding so the result has the same length as x
    return y[window_len:-window_len+1]

I get what seem to be good results (not that I understand the math):

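if form_results['smooth']:
    # results is the list of (timestamp, value) tuples from the question
    a = numpy.array([x[1] for x in results])
    smoothed = smooth(a, window_len=21)
    # list() is needed on Python 3, where zip returns an iterator
    results = list(zip([x[0] for x in results], smoothed))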

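  • This looks reasonable. If you want a plain average, then your window should be 'flat'. The other window types weight the data points in the window differently. (2 upvotes)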
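
To illustrate the comment above, here is a rough sketch comparing the 'flat' weights with the default 'hanning' weights. The toy spike signal and the variable names are made up for illustration; only numpy.ones, numpy.hanning and numpy.convolve come from the recipe.

```python
import numpy as np

window_len = 11

# 'flat' window: every point in the window gets the same weight,
# which makes the result a plain moving average.
flat = np.ones(window_len)
flat /= flat.sum()

# Hanning window: points near the centre of the window weigh the most,
# points at the two ends get zero weight.
hann = np.hanning(window_len)
hann /= hann.sum()

# Toy data (made up): all zeros with a single spike in the middle.
x = np.zeros(50)
x[25] = 10.0

flat_smoothed = np.convolve(x, flat, mode='same')
hann_smoothed = np.convolve(x, hann, mode='same')

# The flat window spreads the spike evenly over 11 samples,
# while the Hanning window keeps more of it near the original position.
print(flat_smoothed[25], hann_smoothed[25])
```

Both sets of weights sum to 1, so the overall level of the data is preserved; they differ only in how much each neighbour contributes.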

Jos*_*del 11

If you have access to numpy, you could try this recipe:

http://www.scipy.org/Cookbook/SignalSmooth
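
If all that is needed is the plain 10-point average from the question, a minimal sketch that does not use the full recipe might look like the following. The function name moving_average and its parameters are hypothetical, and the edge handling (averaging over fewer points near the ends) is one assumption among several reasonable choices; the recipe pads the ends with reflected data instead.

```python
import numpy as np

def moving_average(pairs, window_len=10):
    """Return (timestamp, average of surrounding values) pairs.

    pairs is a list of (unixtime, value) tuples sampled once per second.
    """
    if len(pairs) < window_len:
        raise ValueError("Need at least window_len data points.")
    timestamps = [t for t, _ in pairs]
    values = np.array([v for _, v in pairs], dtype=float)
    kernel = np.ones(window_len)
    # Sum of the values that fall inside the window around each point.
    sums = np.convolve(values, kernel, mode='same')
    # Number of real data points inside the window; this is smaller than
    # window_len near the two ends, where the window hangs off the data.
    counts = np.convolve(np.ones_like(values), kernel, mode='same')
    return list(zip(timestamps, sums / counts))

# Made-up sample in the same shape as the data in the question.
data = [(1301672429 + i, 274 + (i % 3) * 14) for i in range(30)]
smoothed = moving_average(data, window_len=10)
```

The timestamps are passed through unchanged, so the output can be plotted in place of the original list. With an even window_len the window cannot be perfectly centred on each sample; an odd length such as 11 avoids that.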