Asked by And*_*ndy (tags: python, machine-learning, lstm, keras, recurrent-neural-network)
Given X with dimensions (m samples, n sequences, k features), and labels y with dimensions (m samples, 0/1):
Suppose I want to train a stateful LSTM (going by the Keras definition, where "stateful = True" means cell states are not reset between sequences per sample; correct me if I'm wrong!). Should states be reset on a per-epoch or per-sample basis?
Example:
for e in range(n_epochs):
    for m in range(X.shape[0]):      # for each sample
        for n in range(X.shape[1]):  # for each sequence
            # train_on_batch for model...
            # model.reset_states() (1) I believe this is 'stateful = False'?
        # model.reset_states() (2) wouldn't this make more sense?
    # model.reset_states() (3) This is what I usually see...
In short, I'm not sure whether to reset states after each sequence or after each epoch (after all m samples in X have been trained on).
Advice would be greatly appreciated.
If using stateful=True, states are usually reset at the end of each epoch, or every couple of samples. If you want to reset states after every sample, then that is equivalent to just using stateful=False.
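To see that equivalence, here is a minimal sketch (the layer sizes, shapes, and variable names are illustrative assumptions, not taken from the question):

from keras.models import Sequential
from keras.layers import LSTM, Dense

batch_size, timesteps, features = 1, 10, 4  # illustrative values

# stateful=False: Keras resets cell states after every batch automatically
model_a = Sequential()
model_a.add(LSTM(16, input_shape=(timesteps, features), stateful=False))
model_a.add(Dense(1, activation='sigmoid'))
model_a.compile(loss='binary_crossentropy', optimizer='adam')

# stateful=True with a manual reset after every batch behaves equivalently
model_b = Sequential()
model_b.add(LSTM(16, batch_input_shape=(batch_size, timesteps, features), stateful=True))
model_b.add(Dense(1, activation='sigmoid'))
model_b.compile(loss='binary_crossentropy', optimizer='adam')

# for x_batch, y_batch in batches:
#     model_b.train_on_batch(x_batch, y_batch)
#     model_b.reset_states()  # resetting here makes model_b behave like model_a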
Regarding the loop you provided:
for e in range(n_epochs):
    for m in range(X.shape[0]):      # for each sample
        for n in range(X.shape[1]):  # for each sequence
Note that the dimensions of X are not quite
(m samples, n sequences, k features)
but are actually
(batch size, number of timesteps, number of features)
Therefore, you should not have the inner loop:
for n in range(X.shape[1])
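To make this concrete, here is a minimal, self-contained sketch (the array sizes are illustrative assumptions) showing that a single training call consumes the whole timestep axis, so no loop over X.shape[1] is needed:

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

m, timesteps, k = 32, 10, 4              # illustrative sizes
X = np.random.rand(m, timesteps, k)      # (batch size, number of timesteps, number of features)
y = np.random.randint(0, 2, size=(m, 1))

model = Sequential()
model.add(LSTM(8, input_shape=(timesteps, k)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')

# One call processes every timestep of every sample internally;
# there is no need to iterate over X.shape[1] yourself.
model.train_on_batch(X, y)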
Now, regarding the loop
for m in range(X.shape[0])
Since batch enumeration is done automatically in Keras, you don't have to implement this loop either (unless you want to reset states every couple of samples). Hence, if you only want to reset at the end of each epoch, you only need the outer loop.
Here is an example of such an architecture (taken from this blog post):
from keras.models import Sequential
from keras.layers import LSTM, Dense

batch_size = 1
model = Sequential()
model.add(LSTM(16, batch_input_shape=(batch_size, X.shape[1], X.shape[2]), stateful=True))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train one epoch at a time so states can be reset manually at the end of each epoch
for i in range(300):
    model.fit(X, y, epochs=1, batch_size=batch_size, verbose=2, shuffle=False)
    model.reset_states()
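And if you instead wanted to reset states every couple of samples, as mentioned above, one possible sketch using train_on_batch (the reset interval here is an illustrative assumption) would be:

# Assumes the stateful `model`, `X`, and `y` from the example above,
# with batch_size = 1 so each training call sees a single sample.
reset_every = 2  # illustrative: reset cell states every two samples

for epoch in range(300):
    for i in range(X.shape[0]):
        model.train_on_batch(X[i:i + 1], y[i:i + 1])
        if (i + 1) % reset_every == 0:
            model.reset_states()
    model.reset_states()  # also reset at the end of each epoch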