How to configure a very simple LSTM for regression with Keras/Theano

The*_* H. 7 regression theano lstm keras

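I am struggling to configure a Keras LSTM for a simple regression task. The official page gives only a very basic explanation: Keras RNN documentation

To understand it fully, an example configuration with sample data would be extremely helpful.

I have found hardly any examples of regression with a Keras LSTM. Most examples are about classification (text or images). I have studied the LSTM examples that ship with the Keras distribution and one example I found through a Google search: http://danielhnyk.cz/ It offers some insight, although the author admits that the approach is very memory-inefficient, because the data samples have to be stored with a great deal of redundancy.

Although a commenter (Taha) introduced an improvement, the data storage is still redundant, and I doubt that this is what the Keras developers intended.

I have downloaded some simple sequential sample data, which happens to be stock data from Yahoo Finance. It is freely available from Yahoo Finance Data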

Date,       Open,      High,      Low,       Close,     Volume,   Adj Close
2016-05-18, 94.160004, 95.209999, 93.889999, 94.559998, 41923100, 94.559998
2016-05-17, 94.550003, 94.699997, 93.010002, 93.489998, 46507400, 93.489998
2016-05-16, 92.389999, 94.389999, 91.650002, 93.879997, 61140600, 93.879997
2016-05-13, 90.00,     91.669998, 90.00,     90.519997, 44188200, 90.519997

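The table contains more than 8,900 such rows of Apple stock data. Each day has 7 columns (data points). The value to predict is "AdjClose", the value at the end of the day.

So the goal is to predict the next day's AdjClose from the sequence of the previous days. (This is probably close to impossible, but it is always good to see how a tool behaves under challenging conditions.)

I think this should be a very standard prediction/regression case for an LSTM, and one that transfers easily to other problem domains.

So, how should the data (X_train, y_train) be formatted to keep redundancy to a minimum, and how do I initialize a Sequential model with just one LSTM layer and a few hidden neurons?

Kind regards, Theo

PS: I have started coding: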

...
X_train
Out[6]: 
array([[  2.87500000e+01,   2.88750000e+01,   2.87500000e+01,
      2.87500000e+01,   1.17258400e+08,   4.31358010e-01],
   [  2.73750019e+01,   2.73750019e+01,   2.72500000e+01,
      2.72500000e+01,   4.39712000e+07,   4.08852011e-01],
   [  2.53750000e+01,   2.53750000e+01,   2.52500000e+01,
      2.52500000e+01,   2.64320000e+07,   3.78845006e-01],
   ..., 
   [  9.23899994e+01,   9.43899994e+01,   9.16500015e+01,
      9.38799973e+01,   6.11406000e+07,   9.38799973e+01],
   [  9.45500031e+01,   9.46999969e+01,   9.30100021e+01,
      9.34899979e+01,   4.65074000e+07,   9.34899979e+01],
   [  9.41600037e+01,   9.52099991e+01,   9.38899994e+01,
      9.45599976e+01,   4.19231000e+07,   9.45599976e+01]], dtype=float32)

y_train
Out[7]: 
array([  0.40885201,   0.37884501,   0.38822201, ...,  93.87999725,
   93.48999786,  94.55999756], dtype=float32)

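So far the data is prepared, and no redundancy has been introduced. The question now is how to describe a Keras LSTM model and training process for this data.

EDIT 3:

Here is the updated code with the 3D data structure that recurrent networks require (see Lorrit's answer). It does not work, though.

EDIT 4: Removed the extra comma after Activation('sigmoid') and shaped Y_train correctly. Still the same error.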

import numpy as np

from keras.models import Sequential
from keras.layers import Dense,  Activation, LSTM

nb_timesteps    =  4
nb_features     =  5
batch_size      = 32

# load file
X_train = np.genfromtxt('table.csv', 
                        delimiter=',',  
                        names=None, 
                        unpack=False,
                        dtype=None)

# delete the first row with the names
X_train = np.delete(X_train, (0), axis=0)

# invert the order of the rows, so that the oldest
# entry is in the first row and the newest entry
# comes last
X_train = np.flipud(X_train)

# the last column is our Y
Y_train = X_train[:,6].astype(np.float32)

Y_train = np.delete(Y_train, range(0,6))
Y_train = np.array(Y_train)
Y_train.shape = (len(Y_train), 1)

# we don't use the timestamps. convert the rest to Float32
X_train = X_train[:, 1:6].astype(np.float32)

# shape X_train
X_train.shape = (1,len(X_train), nb_features)


# Now comes Lorrit's code for shaping the 3D-input-data
# http://stackoverflow.com/questions/36992855/keras-how-should-i-prepare-input-data-for-rnn
flag = 0

for sample in range(X_train.shape[0]):
    tmp = np.array([X_train[sample,i:i+nb_timesteps,:] for i in range(X_train.shape[1] - nb_timesteps + 1)])

    if flag==0:
        new_input = tmp
        flag = 1

    else:
        new_input = np.concatenate((new_input,tmp))

X_train = np.delete(new_input, len(new_input) - 1, axis = 0)
X_train = np.delete(X_train, 0, axis = 0)
X_train = np.delete(X_train, 0, axis = 0)
# X successfully shaped

# free some memory
tmp = None
new_input = None


# split data for training, validation and test
# 50:25:25
X_train, X_test = np.split(X_train, 2, axis=0)
X_valid, X_test = np.split(X_test, 2, axis=0)

Y_train, Y_test = np.split(Y_train, 2, axis=0)
Y_valid, Y_test = np.split(Y_test, 2, axis=0)


print('Build model...')

model = Sequential([
    Dense(8, input_dim=nb_features),
    Activation('softmax'),
    LSTM(4, dropout_W=0.2, dropout_U=0.2),
    Dense(1),
    Activation('sigmoid')
])

model.compile(loss='mse',
              optimizer='RMSprop',
              metrics=['accuracy'])

print('Train...')
print(X_train.shape)
print(Y_train.shape)
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=15,
          validation_data=(X_test, Y_test))
score, acc = model.evaluate(X_test, Y_test,
                            batch_size=batch_size)

print('Test score:', score)
print('Test accuracy:', acc)

According to Keras, there still seems to be a problem with the data:

Using Theano backend.
Using gpu device 0: GeForce GTX 960 (CNMeM is disabled, cuDNN not available)
Build model...

Traceback (most recent call last):

  File "<ipython-input-1-3a6e9e045167>", line 1, in <module>
    runfile('C:/Users/admin/Documents/pycode/lstm/lstm5.py', wdir='C:/Users/admin/Documents/pycode/lstm')

  File "C:\Users\admin\Anaconda2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 699, in runfile
    execfile(filename, namespace)

  File "C:\Users\admin\Anaconda2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 74, in execfile
    exec(compile(scripttext, filename, 'exec'), glob, loc)

  File "C:/Users/admin/Documents/pycode/lstm/lstm5.py", line 79, in <module>
    Activation('sigmoid')

  File "d:\git\keras\keras\models.py", line 93, in __init__
    self.add(layer)

  File "d:\git\keras\keras\models.py", line 146, in add
    output_tensor = layer(self.outputs[0])

  File "d:\git\keras\keras\engine\topology.py", line 441, in __call__
    self.assert_input_compatibility(x)

  File "d:\git\keras\keras\engine\topology.py", line 382, in assert_input_compatibility
    str(K.ndim(x)))

Exception: Input 0 is incompatible with layer lstm_1: expected ndim=3, found ndim=2

Lor*_*rit 1

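You are still missing one preprocessing step before feeding the data into the LSTM. You have to decide how many previous data samples (previous days) to include in the calculation of the current day's AdjClose. See my answer for how to do that. Your data should then be 3-dimensional, with shape (nb_samples, nb_included_previous_days, features).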
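As a rough illustration of that preprocessing step, here is a minimal NumPy sketch. It is not the code from the linked answer: the helper name `make_windows` and the toy data are assumptions made for this example, and it relies on `numpy.lib.stride_tricks.sliding_window_view` (NumPy 1.20 or newer). Each sample holds `nb_timesteps` consecutive days, and its target is taken from the day that follows the window.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_windows(features, target, nb_timesteps):
    """Hypothetical helper: X[i] holds days i..i+nb_timesteps-1,
    y[i] is the target value on the day right after that window."""
    # Shape (n - nb_timesteps + 1, nb_features, nb_timesteps); a strided view, not a copy.
    windows = sliding_window_view(features, nb_timesteps, axis=0)
    # Reorder the axes to (samples, timesteps, features).
    windows = windows.transpose(0, 2, 1)
    # The last window has no following day, so it is dropped.
    X = windows[:-1]
    y = target[nb_timesteps:].reshape(-1, 1)
    return X, y


# Toy data (assumption): 10 days, 5 features, every feature equals the day index.
days = np.arange(10, dtype=np.float32)
features = np.repeat(days[:, None], 5, axis=1)
target = days * 10.0

X, y = make_windows(features, target, nb_timesteps=4)
print(X.shape, y.shape)   # (6, 4, 5) (6, 1)
print(X[0, :, 0], y[0])   # [0. 1. 2. 3.] [40.]
```

Because `X` is a read-only view onto `features`, the overlapping windows are not stored redundantly at this stage. Whether the training framework copies the data later, for example when it assembles batches, is a separate matter.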

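You can then feed the 3D data into a standard LSTM layer with a single output. Compare that value with y_train and try to minimize the error. Remember to choose a loss function suited to regression, e.g. mean squared error.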
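A minimal sketch of such a model might look as follows. Several things here are assumptions rather than part of the question: it uses the current `tensorflow.keras` API instead of the Keras 1 / Theano setup shown above, the data is random stand-in data already shaped (nb_samples, nb_timesteps, nb_features), and the output layer is linear rather than sigmoid, since a sigmoid cannot produce unscaled prices outside the interval (0, 1). The LSTM is the first layer, so it receives the 3D input directly. In the question's model a `Dense` layer with `input_dim=nb_features` comes first and hands a 2D tensor to the LSTM, which is consistent with the "expected ndim=3, found ndim=2" error.

```python
import numpy as np
from tensorflow import keras

nb_timesteps, nb_features = 4, 5

# Random stand-in data (assumption): 64 windows of 4 days x 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, nb_timesteps, nb_features)).astype("float32")
# Toy regression target derived from the last day of each window, shape (64, 1).
y = 0.5 * X[:, -1, :1]

model = keras.Sequential([
    keras.Input(shape=(nb_timesteps, nb_features)),  # 3D input: (batch, timesteps, features)
    keras.layers.LSTM(4),                            # one LSTM layer with a few hidden units
    keras.layers.Dense(1),                           # single linear output for regression
])

# Mean squared error as the regression loss; no accuracy metric,
# because accuracy is only meaningful for classification.
model.compile(loss="mse", optimizer="rmsprop")

history = model.fit(X, y, batch_size=32, epochs=2, verbose=0)
pred = model.predict(X[:3], verbose=0)
print(pred.shape)
```

With real stock data it would likely help to scale the inputs and targets first, because the Volume column is several orders of magnitude larger than the prices. That step is left out here to keep the sketch short.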