How to prepare data for a stateful LSTM in Keras?

Mys*_*Guy 2 python lstm keras

I want to develop a time-series approach for binary classification, using a stateful LSTM in Keras.

Here is what my data look like. I have a number of recordings, say N. Each recording consists of 22 time series of length M_i (i=1,...,N). I want to use a stateful model in Keras, but I don't know how to reshape my data, and in particular how I should define my batch_size.

Here is how I proceeded for a stateless LSTM: I created sequences of length look_back for all the recordings, so that I have data of size (N*(M_i-look_back), look_back, 22=n_features).

Here is the function I use for this purpose:

import numpy as np

def create_dataset(feat, targ, look_back=1):
    # feat: 2D array (n_samples, n_features); targ: 1D target vector
    # returns X of shape (n_samples - look_back, look_back, n_features)
    # and Y of shape (n_samples - look_back,)
    dataX, dataY = [], []
    for i in range(len(targ) - look_back):
        a = feat[i:(i + look_back), :]         # window of look_back steps
        dataX.append(a)
        dataY.append(targ[i + look_back - 1])  # label of the window's last step
    return np.array(dataX), np.array(dataY)
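For illustration, here is a quick run on dummy data (the shapes are the point; the numbers are arbitrary):

feat = np.random.rand(1000, 22)            # one recording: 1000 steps, 22 channels
targ = np.random.randint(0, 2, size=1000)  # one binary label per step

X, Y = create_dataset(feat, targ, look_back=10)
print(X.shape, Y.shape)  # (990, 10, 22) (990,)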

where feat is a 2D data array of size (n_samples, n_features) (one per recording) and targ is the target vector.

So my question is: given the data described above, how should I reshape them for a stateful model, taking the notion of batches into account? Are there any precautions to take?

What I want to do is classify every time_step of every recording as seizure/non-seizure.

EDIT: Another problem that occurred to me: my recordings contain sequences of different lengths. My stateful model could learn the long-term dependencies within each recording, which means the batch_size differs from one recording to another... How do I deal with that? Won't it cause generalization problems when testing on completely different sequences (test_set)?

Thanks

Dan*_*ler 7

I don't think you need a stateful layer for your purpose.

If you want long term learning, simply don't create these sliding windows. Have your data shaped as:

(number_of_independent_sequences, length_or_steps_of_a_sequence, variables_or_features_per_step)

I'm not sure I understand the wording of your question correctly. If a "recording" is like a "movie", a "song", a "voice clip", or something like that, then:

  • number of sequences = number of recordings

Following that idea of "recording", the time steps will be "the frames in a video", or the "samples" (time x sample_rate for 1 channel) in an audio file. (Be careful: "samples" in Keras are "sequences/recordings", while "samples" in audio processing are "steps" in Keras.)

  • time_steps = number of frames or audio samples

Finally, the number of features/variables. In a movie, it's like the RGB channels (3 features); in audio, it's also the number of channels (2 in stereo). In other kinds of data they may be temperature, pressure, etc.

  • features = number of variables measured in each step

Having your data shaped like this will work for both stateful = True and False.

These two methods of training are equivalent:

# with stateful=False
model.fit(X, Y, batch_size=batch_size)

# with stateful=True: feed the batches manually and reset states between them
for start in range(0, len(X), batch_size):
    model.train_on_batch(X[start:start+batch_size], Y[start:start+batch_size])
    model.reset_states()

The only differences might be in the way the optimizer updates are applied.

For your case, if you can create such input data shaped as mentioned and you're not going to recursively predict the future, I don't see a reason to use stateful=True.

Classifying every step

For classifying every step, you don't need to create sliding windows, and it's not necessary to use stateful=True either.

Recurrent layers have an option to output all time steps, by setting return_sequences=True.

If you have an input with shape (batch, steps, features), you will need targets with shape (batch, steps, 1), which is one class per step.

In short, you need:

  • LSTM layers with return_sequences=True
  • X_train with shape (files, total_eeg_length, 22)
  • Y_train with shape (files, total_eeg_length, 1)

Hint: since LSTMs rarely classify the beginning of a sequence very well, you can try using Bidirectional(LSTM(...)) layers.
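For instance, a minimal per-step classifier could be sketched as follows (the layer size of 64 is an illustrative assumption; None in input_shape allows any sequence length):

from keras.models import Sequential
from keras.layers import Bidirectional, LSTM, TimeDistributed, Dense

model = Sequential()
# return_sequences=True -> one output vector per time step
model.add(Bidirectional(LSTM(64, return_sequences=True),
                        input_shape=(None, 22)))
# one sigmoid unit per step -> output shape (batch, steps, 1)
model.add(TimeDistributed(Dense(1, activation='sigmoid')))
model.compile(optimizer='adam', loss='binary_crossentropy')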

Inputs with different lengths

For using inputs with different lengths, you need to set input_shape=(None, features). Considering our discussion in the chat, features = 22.

You can then:

  • Load each EEG individually:

    • X_train as (1, eeg_length, 22)
    • Y_train as (1, eeg_length, 1)
    • Train each EEG separately with model.train_on_batch(array, targets).
    • You will need to manage epochs manually and use test_on_batch for validation data (see the sketch after this list).
  • Pad the shorter EEGs with zeros or another dummy value until they all reach the max_eeg_length and use:

    • a Masking layer at the beginning of the model to discard the steps with the dummy value.
    • X_train as (eegs, max_eeg_length, 22)
    • Y_train as (eegs, max_eeg_length, 1)
    • You can train with a regular model.fit(X_train, Y_train, ...), as sketched after this list.
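
A rough sketch covering both options, with dummy arrays standing in for the real EEGs (the layer size, epochs, and batch size are illustrative assumptions):

import numpy as np
from keras.models import Sequential
from keras.layers import Masking, Bidirectional, LSTM, TimeDistributed, Dense
from keras.preprocessing.sequence import pad_sequences

# dummy EEGs of different lengths: 22 channels, one binary label per step
lengths = (100, 150, 120)
eeg_list = [np.random.rand(n, 22).astype('float32') for n in lengths]
target_list = [np.random.randint(0, 2, size=(n, 1)).astype('float32') for n in lengths]

model = Sequential()
# the Masking layer discards the padded steps (only relevant for option 2)
model.add(Masking(mask_value=0.0, input_shape=(None, 22)))
model.add(Bidirectional(LSTM(64, return_sequences=True)))
model.add(TimeDistributed(Dense(1, activation='sigmoid')))
model.compile(optimizer='adam', loss='binary_crossentropy')

# option 1: one EEG per batch, epochs managed manually
for epoch in range(5):
    for x, y in zip(eeg_list, target_list):
        model.train_on_batch(x[np.newaxis], y[np.newaxis])  # (1, eeg_length, 22), (1, eeg_length, 1)

# option 2: pad everything to max_eeg_length and use a regular fit
X_train = pad_sequences(eeg_list, padding='post', dtype='float32', value=0.0)     # (eegs, max_eeg_length, 22)
Y_train = pad_sequences(target_list, padding='post', dtype='float32', value=0.0)  # (eegs, max_eeg_length, 1)
model.fit(X_train, Y_train, batch_size=2, epochs=5)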