Exponential smoothing state space models in Python/scikit/numpy as an alternative to R

Dee*_*k m 18 python numpy scipy scikit-learn

In R we have a good forecasting model, for example:

ets(y, model="ZZZ", damped=NULL, alpha=NULL, beta=NULL, gamma=NULL,
    phi=NULL, additive.only=FALSE, lambda=NULL,
    lower=c(rep(0.0001,3), 0.8), upper=c(rep(0.9999,3),0.98),
    opt.crit=c("lik","amse","mse","sigma","mae"), nmse=3,
    bounds=c("both","usual","admissible"), ic=c("aicc","aic","bic"),
    restrict=TRUE, allow.multiplicative.trend=FALSE, use.initial.values=FALSE, ...)

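With this method, whatever series we pass in, it automatically selects the seasonal, trend, and error types (model="ZZZ"/"AMA"/"MMZ") and tunes the smoothing parameters to get accurate results.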

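  • In Python, do we have anything similar to ets in pandas/numpy/scipy/scikit?

    From my research:
    Ewma in pandas is similar, but we need to hard-code all the parameters as fixed values.
    For Holt-Winters, we need to write detailed methods for every trend and seasonal type.

  • So, is there any ready-made function that takes a DataFrame as input and returns forecast values, without our having to write any internal functions for the parameters ourselves?

  • Is there any fine-tuned regression model in scikit/statsmodels?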
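
To illustrate the point about fixed parameters, here is a minimal plain-Python sketch of simple exponential smoothing. The function name `simple_exp_smoothing` is hypothetical and not part of any library; the recursion is the standard one, and `alpha` has to be picked by hand, which is exactly the limitation described above.

```python
def simple_exp_smoothing(series, alpha):
    """Simple exponential smoothing with a hand-picked, fixed alpha."""
    smoothed = [series[0]]  # the first smoothed value is the first observation
    for value in series[1:]:
        # new level = alpha * observation + (1 - alpha) * previous level
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


print(simple_exp_smoothing([3, 5, 9], alpha=0.5))  # [3, 4.0, 6.5]
```

Nothing here estimates `alpha` from the data; that is the part ets automates.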

Joh*_*fis 7

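After some searching, I haven't found anything that looks really promising as an ets alternative for Python. There are some attempts, though: StatsModels and pycast's forecasting methods, which you can check to see whether they suit your needs.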

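One option for working around the missing implementation is to run an R script from Python using the subprocess module. There is a very good article about how to do this here.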

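To do so:

  1. You need to create an R script (e.g. my_forecast.R) that computes the forecasts (using ets) and either writes them to a file or prints them to stdout (with the cat() command), so that you can use them after the script has run.
  2. You can run the R script from a Python script as follows: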

    import subprocess

    # Define the command that will run the R script through Rscript
    command = 'Rscript'
    path2script = 'path/to/my_forecast.R'
    cmd = [command, path2script]

    # Option 1: if your script writes the forecasts to a file
    subprocess.run(cmd, check=True)
    with open('path/to/created/file', 'r') as f:
        forecasts = f.read()
    # ... do stuff from here ...

    # Option 2: if your script prints the forecasts to stdout
    forecasts = subprocess.check_output(cmd, universal_newlines=True)
    # ... do stuff from here ...

    You can also append arguments to cmd, which your R script will receive as command-line arguments, as follows:

    args = [arg0, arg1, ...]
    
    cmd = [command, path2script] + args 
    # Then pass cmd to subprocess as before
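
Putting the two steps together, here is a hedged sketch of a helper that runs the script and converts its output to numbers. The names `parse_forecasts` and `run_r_forecast` are hypothetical, and the sketch assumes that Rscript is on your PATH and that the R script prints nothing but whitespace-separated forecast values.

```python
import subprocess


def parse_forecasts(stdout_text):
    """Turn whitespace-separated numbers printed by the R script into floats."""
    return [float(token) for token in stdout_text.split()]


def run_r_forecast(path2script, *args):
    """Run an R script with Rscript and return the numbers it prints.

    Assumption: Rscript is on PATH and the script prints only the
    forecasts, for example with cat(forecasts).
    """
    # Command-line arguments must be strings
    cmd = ['Rscript', path2script] + [str(a) for a in args]
    output = subprocess.check_output(cmd, universal_newlines=True)
    return parse_forecasts(output)


# The parsing step can be checked without R installed:
print(parse_forecasts("101.5\n102.25 103\n"))  # [101.5, 102.25, 103.0]
```

If the script also prints messages or values such as NA, `float()` will raise a ValueError, so the R side should keep its stdout clean.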

Edit:

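I found a series of articles on Holt-Winters forecasting: part 1, part 2, and part 3. Besides the easy-to-follow analysis in these articles, Gregory Trubetskoy (the author) provides the code he developed: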

Initial trend:

def initial_trend(series, slen):
    total = 0.0  # named total to avoid shadowing the built-in sum()
    for i in range(slen):
        total += float(series[i+slen] - series[i]) / slen
    return total / slen

# >>> initial_trend(series, 12)
# -0.7847222222222222

Initial seasonal components:

def initial_seasonal_components(series, slen):
    seasonals = {}
    season_averages = []
    n_seasons = int(len(series)/slen)
    # compute season averages
    for j in range(n_seasons):
        season_averages.append(sum(series[slen*j:slen*j+slen])/float(slen))
    # compute initial values
    for i in range(slen):
        sum_of_vals_over_avg = 0.0
        for j in range(n_seasons):
            sum_of_vals_over_avg += series[slen*j+i]-season_averages[j]
        seasonals[i] = sum_of_vals_over_avg/n_seasons
    return seasonals

# >>> initial_seasonal_components(series, 12)
# {0: -7.4305555555555545, 1: -15.097222222222221, 2: -7.263888888888888,
#  3: -5.097222222222222,  4: 3.402777777777778,   5: 8.069444444444445,  
#  6: 16.569444444444446,  7: 9.736111111111112,   8: -0.7638888888888887,
#  9: 1.902777777777778,  10: -3.263888888888889, 11: -0.7638888888888887}

Finally, the algorithm:

def triple_exponential_smoothing(series, slen, alpha, beta, gamma, n_preds):
    result = []
    seasonals = initial_seasonal_components(series, slen)
    for i in range(len(series)+n_preds):
        if i == 0: # initial values
            smooth = series[0]
            trend = initial_trend(series, slen)
            result.append(series[0])
            continue
        if i >= len(series): # we are forecasting
            m = i - len(series) + 1
            result.append((smooth + m*trend) + seasonals[i%slen])
        else:
            val = series[i]
            last_smooth, smooth = smooth, alpha*(val-seasonals[i%slen]) + (1-alpha)*(smooth+trend)
            trend = beta * (smooth-last_smooth) + (1-beta)*trend
            seasonals[i%slen] = gamma*(val-smooth) + (1-gamma)*seasonals[i%slen]
            result.append(smooth+trend+seasonals[i%slen])
    return result

# # forecast 24 points (i.e. two seasons)
# >>> triple_exponential_smoothing(series, 12, 0.716, 0.029, 0.993, 24)
# [30, 20.34449316666667, 28.410051892109554, 30.438122252647577, 39.466817731253066, ...
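
The function above takes alpha, beta, and gamma as given. As a rough illustration of how they could be chosen automatically, the sketch below runs a coarse grid search that minimizes the sum of squared one-step-ahead errors. This is an assumption on my part and not code from the articles: `hw_additive_sse` and `grid_search` are hypothetical names, the data is synthetic, and R's ets uses a proper optimizer and likelihood-based criteria instead of a grid.

```python
import itertools


def hw_additive_sse(series, slen, alpha, beta, gamma):
    """Sum of squared one-step-ahead errors of additive Holt-Winters."""
    n_seasons = len(series) // slen
    # Initial seasonal components: mean deviation from each season's average
    averages = [sum(series[slen * j:slen * (j + 1)]) / slen
                for j in range(n_seasons)]
    seasonals = [
        sum(series[slen * j + i] - averages[j] for j in range(n_seasons)) / n_seasons
        for i in range(slen)
    ]
    # Initial trend: average per-step change between the first two seasons
    trend = sum((series[i + slen] - series[i]) / slen for i in range(slen)) / slen
    smooth = series[0]
    sse = 0.0
    for i in range(1, len(series)):
        val = series[i]
        # Forecast made before the observation at step i is used
        forecast = smooth + trend + seasonals[i % slen]
        sse += (val - forecast) ** 2
        last_smooth = smooth
        smooth = alpha * (val - seasonals[i % slen]) + (1 - alpha) * (smooth + trend)
        trend = beta * (smooth - last_smooth) + (1 - beta) * trend
        seasonals[i % slen] = gamma * (val - smooth) + (1 - gamma) * seasonals[i % slen]
    return sse


def grid_search(series, slen, grid=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Return (best_sse, alpha, beta, gamma) over a coarse parameter grid."""
    return min(
        (hw_additive_sse(series, slen, a, b, g), a, b, g)
        for a, b, g in itertools.product(grid, repeat=3)
    )


# Synthetic data: linear trend plus a seasonal pattern of period 4
series = [10 + 0.5 * t + [2, -1, -3, 2][t % 4] for t in range(40)]
best_sse, alpha, beta, gamma = grid_search(series, slen=4)
print(best_sse, alpha, beta, gamma)
```

The series needs at least two full seasons for the initial trend to be computable. A finer grid or a numerical optimizer such as those in scipy.optimize would give better parameters at higher cost.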

You can put these in a file, e.g. holtwinters.py, inside a folder with the following structure:

forecast_folder
|
├── __init__.py
|
└── holtwinters.py

From here on, this is a Python module that you can place inside any project structure you like and use anywhere inside that project simply by importing it.

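  • I'm awarding you the bounty because you put a lot of effort into answering this, but it's still not what I'm looking for. If someone finds (or creates) all 30 of Rob Hyndman's state space exponential smoothing models in the future, I'll gladly create and award another bounty: https://www.otexts.org/fpp/7/7 (2 upvotes)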

Lit*_*les 5

As part of the latest Statsmodels release, ETS is now available in Python. Here is a simple example of its usage:

import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.api import ExponentialSmoothing

# make our own periodic data
# 3 weeks of hourly data
m = 24
days = 21
x = np.array(range(days * m)) / (m / 2 / np.pi)
st = 4 * np.sin(x)
lt = np.linspace(0, 15, days * m)
bt = 100
e = np.random.normal(scale=0.5, size=x.shape)

# make the ts we wish to forecast
y = lt + st + bt + e

# our guessed parameters
alpha = 0.4
beta = 0.2
gamma = 0.01

# initialise model
ets_model = ExponentialSmoothing(y, trend='add', seasonal='add',
                                 seasonal_periods=24)
# smoothing_trend was called smoothing_slope in older statsmodels releases
ets_fit = ets_model.fit(smoothing_level=alpha, smoothing_trend=beta,
                        smoothing_seasonal=gamma)

# forecast p hours ahead
p_ahead = 48
yh = ets_fit.forecast(p_ahead)

# plot the y, y_smoothed and y_hat ts'
plt.plot(y, label='y')
plt.plot(ets_fit.fittedvalues, label='y_smooth')
plt.plot(range(days * m, days * m + p_ahead),yh, label='y_hat')

plt.legend()
plt.show()


Here are more examples in notebook form.

Finally, here is the source code, should you wish to take a look.