How do I build multi-step time-series input data for an LSTM using external features?

Yij*_*hen 15 time-series neural-network lstm keras recurrent-neural-network

I am trying to use an LSTM for store sales forecasting. Here is what my raw data looks like:

|     Date   | StoreID | Sales | Temperature |  Open   | StoreType |
|------------|---------|-------|-------------|---------|-----------|
| 01/01/2016 |   1     |   0   |      36     |    0    |     1     |
| 01/02/2016 |   1     | 10100 |      42     |    1    |     1     |
| ...
| 12/31/2016 |   1     | 14300 |      39     |    1    |     1     |
| 01/01/2016 |   2     | 25000 |      46     |    1    |     3     |
| 01/02/2016 |   2     | 23700 |      43     |    1    |     3     |
| ...
| 12/31/2016 |   2     | 20600 |      37     |    1    |     3     |
| ...
| 12/31/2016 |   10    | 19800 |      52     |    1    |     2     |

I need to forecast sales for the next 10 days. In this example, I would need to predict store sales from 01-01-2017 through 01-10-2017. I know how to frame this with other time-series or regression models, but I want to know whether an RNN-LSTM is a good fit for it.

I started by testing the LSTM on the storeID = 1 data only. Suppose my data had only Date and Sales; I would build my trainX and trainY this way (please correct me if I'm wrong):

Window = 20
Horizon = 10

|         trainX                  |          trainY              |
| [Yt-10, Yt-11, Yt-12,...,Yt-29] | [Yt, Yt-1, Yt-2,...,Yt-9]    |
| [Yt-11, Yt-12, Yt-13,...,Yt-30] | [Yt-1, Yt-2, Yt-3,...,Yt-10] |
| [Yt-12, Yt-13, Yt-14,...,Yt-31] | [Yt-2, Yt-3, Yt-4,...,Yt-11] |
...
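A minimal numpy sketch of this windowing, assuming a plain sliding window; the names `build_windows`, `window`, and `horizon` are illustrative, not from the original post:

```python
import numpy as np

def build_windows(series, window=20, horizon=10):
    """Slide over a 1-D sales series and emit (input window, target window) pairs."""
    X, Y = [], []
    # Each sample uses `window` past values to predict the next `horizon` values.
    for start in range(len(series) - window - horizon + 1):
        X.append(series[start:start + window])
        Y.append(series[start + window:start + window + horizon])
    X = np.array(X)
    Y = np.array(Y)
    # LSTM input expects (samples, timesteps, features): here 20 timesteps, 1 feature.
    return X.reshape(-1, window, 1), Y

sales = np.arange(330, dtype=float)  # stand-in for one store's daily sales
trainX, trainY = build_windows(sales)
print(trainX.shape, trainY.shape)  # (301, 20, 1) (301, 10)
```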

After reshaping both:

trainX.shape
(300, 1, 20)
trainY.shape
(300, 10)

Question 1: In this case, [samples, timesteps, features] = [300, 1, 20]. Is this correct? Or should I build the samples as [300, 20, 1]?

Question 2: I do want to use the other information in the raw data, such as Temperature, StoreType, etc. How should I construct the input data for the LSTM?

Question 3: So far we have only discussed forecasting for 1 store. If I want to predict for all the stores, how should I construct my input data then?

Currently I am following the example from here, but it doesn't seem to cover the scenario I have. I would really appreciate your help!

Mar*_*jko 13

I solved a similar problem recently. In your case:

  1. The input should have shape (300, 20, 1), because you have a time series of length 20 with 1 feature.

  2. You could do it like this:

    from keras.layers import Input, LSTM, Dense, concatenate
    from keras.models import Model

    sequential_input = Input(shape=(20, 1))
    feature_input = Input(shape=(feature_nb,))
    lstm_layer = LSTM(lstm_units_1st_layer, return_sequences=True)(sequential_input)
    lstm_layer = LSTM(lstm_units_2nd_layer, return_sequences=True)(lstm_layer)
    ...
    lstm_layer = LSTM(lstm_units_nth_layer, return_sequences=False)(lstm_layer)
    merged = concatenate([lstm_layer, feature_input])
    blend = Dense(blending_units_1st_layer, activation='relu')(merged)
    blend = Dense(blending_units_2nd_layer, activation='relu')(blend)
    ...
    output = Dense(10)(blend)
    
  3. This is the hardest part. I would suggest predicting multiple stores by feeding them to the network as a feature vector. You could simply skip this part and try to predict the different stores with one model each, or post-process the outputs, e.g. with some kind of graphical model or with PCA on a matrix whose rows are daily sales.
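One way to sketch the "stores as a feature vector" idea in numpy: stack the windows from every store and attach a one-hot store id as the static feature vector. The helper name `build_store_samples` and the one-hot encoding are assumptions for illustration, not part of the original answer:

```python
import numpy as np

def build_store_samples(sales_by_store, window=20, horizon=10):
    """Stack sliding windows from every store; encode each store id as a
    one-hot static feature vector (illustrative encoding, one of several options)."""
    n_stores = len(sales_by_store)
    seq, feat, targets = [], [], []
    for store_idx, series in enumerate(sales_by_store):
        one_hot = np.eye(n_stores)[store_idx]
        for start in range(len(series) - window - horizon + 1):
            seq.append(series[start:start + window])
            feat.append(one_hot)
            targets.append(series[start + window:start + window + horizon])
    return (np.array(seq).reshape(-1, window, 1),
            np.array(feat),
            np.array(targets))

stores = [np.random.rand(100) for _ in range(3)]  # toy daily sales for 3 stores
seq, feat, y = build_store_samples(stores)
# Per store: 100 - 20 - 10 + 1 = 71 samples, so 213 samples in total.
```

The `seq` and `feat` arrays then line up with the `sequential_input` and `feature_input` branches of the model above.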

Update:

To handle multiple sequential features, you can do the following:

    from keras.layers import Input, LSTM, Dense, concatenate
    from keras.models import Model

    sequential_input = Input(shape=(20, nb_of_sequental_features))
    feature_input = Input(shape=(feature_nb,))
    lstm_layer = LSTM(lstm_units_1st_layer, return_sequences=True)(sequential_input)
    lstm_layer = LSTM(lstm_units_2nd_layer, return_sequences=True)(lstm_layer)
    ...
    lstm_layer = LSTM(lstm_units_nth_layer, return_sequences=False)(lstm_layer)
    merged = concatenate([lstm_layer, feature_input])
    blend = Dense(blending_units_1st_layer, activation='relu')(merged)
    blend = Dense(blending_units_2nd_layer, activation='relu')(blend)
    ...
    output = Dense(10)(blend)
    model = Model(inputs=[sequential_input, feature_input], outputs=output)

In this case, your input should consist of a list of two arrays: `[sequential_data, features]`, where `sequential_data.shape = (nb_of_examples, timesteps, sequential_features)` and `features.shape = (nb_of_examples, feature_nb)`. So `sales` and `temperature` should be stored in `sequential_data`, and `store_type` in `features`.
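A small numpy sketch of assembling that two-array input for one store; the series, the one-hot `store_type` encoding, and the variable names are assumptions for illustration:

```python
import numpy as np

window, horizon, days = 20, 10, 100
sales = np.random.rand(days)        # toy daily sales series
temperature = np.random.rand(days)  # toy daily temperature series
store_type = np.array([0.0, 0.0, 1.0])  # e.g. a one-hot encoding of StoreType = 3

seq_windows, static_feats, targets = [], [], []
for start in range(days - window - horizon + 1):
    # Stack the two day-level series side by side: shape (window, 2).
    seq_windows.append(np.column_stack([sales[start:start + window],
                                        temperature[start:start + window]]))
    static_feats.append(store_type)  # constant per store, repeated per sample
    targets.append(sales[start + window:start + window + horizon])

sequential_data = np.array(seq_windows)  # (samples, 20, 2)
features = np.array(static_feats)        # (samples, 3)
trainY = np.array(targets)               # (samples, 10)
# model.fit([sequential_data, features], trainY, ...) would then consume these.
```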