How do I create a variable-length input LSTM in Keras?

eri*_*rip 37 variable-length python-3.x lstm keras recurrent-neural-network

I'm trying to do some vanilla pattern recognition with an LSTM in Keras to predict the next element in a sequence.

My data looks like this:

[image: my data]

where the label for a training sequence is the last element in the list: X_train['Sequence'][n][-1].

Because my Sequence column can have a variable number of elements per sequence, I believe an RNN is the best model to use. Below is my attempt to build an LSTM in Keras:

# Build the model
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense, Activation

# A few arbitrary constants...
max_features = 20000
out_size = 128

# The max length should be the length of the longest sequence (minus one to account for the label)
max_length = X_train['Sequence'].apply(len).max() - 1

# Normal LSTM model construction with sigmoid activation
model = Sequential()
model.add(Embedding(max_features, out_size, input_length=max_length, dropout=0.2))
model.add(LSTM(128, dropout_W=0.2, dropout_U=0.2))
model.add(Dense(1))
model.add(Activation('sigmoid'))

# try using different optimizers and different optimizer configs
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

Here's how I attempted to train the model:

import numpy as np

# Train the model
for seq in X_train['Sequence']:
    print("Length of training is {0}".format(len(seq[:-1])))
    print("Training set is {0}".format(seq[:-1]))
    model.fit(np.array([seq[:-1]]), [seq[-1]])

My output looks like this:

Length of training is 13
Training set is [1, 3, 13, 87, 1053, 28576, 2141733, 508147108, 402135275365, 1073376057490373, 9700385489355970183, 298434346895322960005291, 31479360095907908092817694945]

However, I get the following error:

Exception: Error when checking model input: expected embedding_input_1 to have shape (None, 347) but got array with shape (1, 13)

I believe my training step is set up correctly, so my model construction must be wrong. Note that 347 is max_length: the Embedding layer was built with input_length=347, but the first sequence I feed it has only 13 elements.

How do I correctly build a variable-length input LSTM in Keras? I'd prefer not to pad the data. Not sure if it's relevant, but I'm using the Theano backend.
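(For completeness, the shape error itself goes away with the standard padding approach — a minimal sketch, shown only to illustrate what the Embedding layer expects, since padding is exactly what I'd like to avoid. It fixes the shapes only; my huge sequence values would separately need to be mapped into [0, max_features).)

from keras.preprocessing.sequence import pad_sequences
import numpy as np

# Illustrative only: left-pad every sequence to max_length so it matches the
# Embedding layer's fixed input_length of 347
X = pad_sequences([seq[:-1] for seq in X_train['Sequence']], maxlen=max_length)
y = np.array([seq[-1] for seq in X_train['Sequence']])
model.fit(X, y, batch_size=32, nb_epoch=10)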

Van*_*Van 18

I'm not clear about your embedding procedure, but here is a way to implement a variable-length input LSTM regardless: when constructing the LSTM, just don't specify the timespan dimension.

import numpy as np
import keras.backend as K
from keras.layers import LSTM, Input

I = Input(shape=(None, 200))  # unknown timespan, fixed feature size
lstm = LSTM(20)
f = K.function(inputs=[I], outputs=[lstm(I)])

data1 = np.random.random(size=(1, 100, 200))  # batch_size = 1, timespan = 100
print(f([data1])[0].shape)
# (1, 20)

data2 = np.random.random(size=(1, 314, 200))  # batch_size = 1, timespan = 314
print(f([data2])[0].shape)
# (1, 20)

  • This shows how to get a function, but how do you train it and use it for prediction? (20 upvotes)
  • In any case, at training time you can only use variable sequence lengths with batch size = 1, or a fixed sequence length together with cropping of longer sequences and padding of shorter ones. (3 upvotes)
  • So you can't do this if you want a batch size greater than 1? (2 upvotes)
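To make the comments above concrete, here is a minimal sketch of training and predicting with batch size = 1 on variable-length input, using a trainable Model built the same way as the backend function above (the 200-dimensional features and the toy labels are invented for illustration):

import numpy as np
from keras.models import Model
from keras.layers import Input, LSTM, Dense

I = Input(shape=(None, 200))        # unknown timespan, fixed feature size
out = Dense(1, activation='sigmoid')(LSTM(20)(I))
model = Model(input=I, output=out)  # Keras 1 keywords; Keras 2 uses inputs=/outputs=
model.compile(loss='binary_crossentropy', optimizer='adam')

# Every sequence in a batch must share one timespan, so feed one sequence per batch
for seq, label in [(np.random.random((100, 200)), 1),
                   (np.random.random((314, 200)), 0)]:
    model.train_on_batch(seq[None, :, :], np.array([[label]]))

# Prediction likewise accepts any timespan, one sequence at a time
print(model.predict(np.random.random((1, 123, 200))).shape)  # (1, 1)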

Sea*_*123 6

The trick to training and classifying sequences is to train with masking and classify with a stateful network. Here's an example I made that classifies whether a variable-length sequence starts with zero: the network is trained in batches on padded, masked sequences, and the learned weights are then copied into a stateful model that consumes one timestep at a time, so sequences of any length can be classified without padding.

import numpy as np
np.random.seed(1)

import tensorflow as tf
tf.set_random_seed(1)

from keras import models
from keras.layers import Dense, Masking, LSTM

import matplotlib.pyplot as plt


def stateful_model():
    # Inference model: batch_input_shape=(1, 1, 1) plus stateful=True lets us
    # feed a sequence one timestep at a time while the LSTM keeps its state
    hidden_units = 256

    model = models.Sequential()
    model.add(LSTM(hidden_units, batch_input_shape=(1, 1, 1), return_sequences=False, stateful=True))
    model.add(Dense(1, activation='relu', name='output'))

    model.compile(loss='binary_crossentropy', optimizer='rmsprop')

    return model


def train_rnn(x_train, y_train, max_len, mask):
    # Training model: sequences are padded to max_len, and the Masking layer
    # skips every timestep equal to the mask value
    epochs = 10
    batch_size = 200

    vec_dims = 1
    hidden_units = 256
    in_shape = (max_len, vec_dims)

    model = models.Sequential()

    model.add(Masking(mask, name="in_layer", input_shape=in_shape,))
    model.add(LSTM(hidden_units, return_sequences=False))
    model.add(Dense(1, activation='relu', name='output'))

    model.compile(loss='binary_crossentropy', optimizer='rmsprop')

    model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs,
              validation_split=0.05)

    return model


def gen_train_sig_cls_pair(t_stops, num_examples, mask):
    # Padded training data: each signal is max_t long, with timesteps beyond
    # its true length t_stop filled with the mask value
    x = []
    y = []
    max_t = int(np.max(t_stops))

    for t_stop in t_stops:
        one_indices = np.random.choice(a=num_examples, size=num_examples // 2, replace=False)

        sig = np.zeros((num_examples, max_t), dtype=np.int8)
        sig[one_indices, 0] = 1
        sig[:, t_stop:] = mask
        x.append(sig)

        cls = np.zeros(num_examples, dtype=np.bool)
        cls[one_indices] = 1
        y.append(cls)

    return np.concatenate(x, axis=0), np.concatenate(y, axis=0)


def gen_test_sig_cls_pair(t_stops, num_examples):
    # Unpadded test data: each signal keeps its true length t_stop
    x = []
    y = []

    for t_stop in t_stops:
        one_indices = np.random.choice(a=num_examples, size=num_examples // 2, replace=False)

        sig = np.zeros((num_examples, t_stop), dtype=np.bool)
        sig[one_indices, 0] = 1
        x.extend(list(sig))

        cls = np.zeros((num_examples, t_stop), dtype=np.bool)
        cls[one_indices] = 1
        y.extend(list(cls))

    return x, y


if __name__ == '__main__':
    noise_mag = 0.01
    mask_val = -10
    signal_lengths = (10, 15, 20)

    x_in, y_in = gen_train_sig_cls_pair(signal_lengths, 10, mask_val)

    mod = train_rnn(x_in[:, :, None], y_in, int(np.max(signal_lengths)), mask_val)

    testing_dat, expected = gen_test_sig_cls_pair(signal_lengths, 3)

    # Copy the trained weights into the stateful one-timestep model
    state_mod = stateful_model()
    state_mod.set_weights(mod.get_weights())

    res = []
    for s_i in range(len(testing_dat)):
        seq_in = list(testing_dat[s_i])
        seq_len = len(seq_in)

        # Feed the sequence one timestep at a time, then reset the LSTM state
        # before the next sequence
        for t_i in range(seq_len):
            res.extend(state_mod.predict(np.array([[[seq_in[t_i]]]])))

        state_mod.reset_states()

    fig, axes = plt.subplots(2)
    axes[0].plot(np.concatenate(testing_dat), label="input")

    axes[1].plot(res, "ro", label="result", alpha=0.2)
    axes[1].plot(np.concatenate(expected, axis=0), "bo", label="expected", alpha=0.2)
    axes[1].legend(bbox_to_anchor=(1.1, 1))

    plt.show()