Error when threading SARIMAX models

Question

Uas*_*ana 5 python parallel-processing arima

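I am using the threading library for the first time to speed up the training of my SARIMAX models, but the code keeps failing with the following error: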

Bad direction in the line search; refresh the lbfgs memory and restart the iteration.
This problem is unconstrained.
This problem is unconstrained.
This problem is unconstrained.

Here is my code:

import numpy as np
import pandas as pd
from statsmodels.tsa.arima_model import ARIMA
import statsmodels.tsa.api as smt
from threading import Thread

def process_id(ndata):
    # Hold out the last 7 observations as the test set.
    train = ndata[0:-7]
    test = ndata[len(train):]
    try:
        model = smt.SARIMAX(train.asfreq(freq='1d'), exog=None, order=(0, 1, 1), seasonal_order=(0, 1, 1, 7)).fit()
        pred = model.get_forecast(len(test))
        fcst = pred.predicted_mean
        fcst.index = test.index
        # Mean absolute percentage error over the held-out week.
        mapelist = []
        for i in range(len(fcst)):
            mapelist.append(np.absolute(test.iloc[i] - fcst.iloc[i]) / test.iloc[i])
        mape = np.mean(mapelist) * 100
        print(mape)
    except Exception:
        # Any failure in fitting or forecasting is recorded as a MAPE of 0.
        mape = 0
    return mape

def process_range(ndata, store=None):
   if store is None:
      store = {}
   for id in ndata:
      store[id] = process_id(ndata[id])
   return store


def threaded_process_range(nthreads, ndata):
    store = {}
    threads = []
    # Create the threads, giving each an equal-sized slice of the columns.
    k = 0
    tk = ndata.columns
    chunk = len(tk) // nthreads  # integer division so the slice bounds are ints
    for i in range(nthreads):
        dk = tk[k:k + chunk]
        k = k + chunk
        t = Thread(target=process_range, args=(ndata[dk], store))
        threads.append(t)
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store

outdata = threaded_process_range(4,ndata)

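A few points I would like to mention:

The data is a set of daily stock time series in a DataFrame
Threading works with the ARIMA model
The SARIMAX model works when run in a plain for loop

Any insights would be much appreciated. Thanks!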

Answer 1

Roh*_*mar 7

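I ran into the same error with lbfgs. I am not sure why lbfgs fails at the gradient evaluation, but I tried changing the optimizer. You can try this too; choose any of these optimizers:

'newton' for Newton-Raphson, 'nm' for Nelder-Mead

'bfgs' for Broyden-Fletcher-Goldfarb-Shanno (BFGS)

'lbfgs' for limited-memory BFGS with optional box constraints

'powell' for modified Powell's method

'cg' for conjugate gradient

'ncg' for Newton-conjugate gradient

'basinhopping' for the global basin-hopping solver

Change this in your code: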

model = smt.SARIMAX(train.asfreq(freq='1d'), exog=None, order=(0, 1, 1), seasonal_order=(0, 1, 1, 7)).fit(method='cg')
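
If you would rather not hard-code a single optimizer, the same idea can be wrapped in a small helper that tries several method names in turn. This is only a rough sketch: fit_with_fallback and its default method order are hypothetical names and choices of mine, not part of statsmodels. It assumes the model object has a fit(method=..., disp=...) call like SARIMAX does, and that the returned results expose an mle_retvals dict with a 'converged' entry (if that attribute is missing, the fit is simply accepted).

```python
def fit_with_fallback(model, methods=("lbfgs", "cg", "nm", "powell"), **fit_kwargs):
    """Try each optimizer name in turn (hypothetical helper, not a statsmodels API).

    Returns (results, method) for the first fit that neither raises nor
    reports non-convergence in results.mle_retvals.
    """
    fit_kwargs.setdefault("disp", False)  # keep the optimizer output quiet
    last_error = None
    for method in methods:
        try:
            results = model.fit(method=method, **fit_kwargs)
        except Exception as exc:
            # Remember the failure and move on to the next optimizer.
            last_error = exc
            continue
        # Assumption: a 'converged' flag is available; if not, accept the fit.
        retvals = getattr(results, "mle_retvals", None) or {}
        if retvals.get("converged", True):
            return results, method
    raise RuntimeError(f"no optimizer in {methods!r} converged") from last_error
```

With the question's code this would be called as results, used = fit_with_fallback(smt.SARIMAX(train.asfreq(freq='1d'), order=(0, 1, 1), seasonal_order=(0, 1, 1, 7))). Note that the "Bad direction in the line search" text is printed by the optimizer rather than raised as a Python exception, which is why the sketch also looks at the convergence flag instead of relying on try/except alone.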

This is an old question, but I am answering anyway in case someone runs into the same problem in the future.

Asked:	8 years, 9 months ago
Viewed:	1,617 times
Last active:	7 years, 1 month ago