Keras - Feeding 3-channel images into an LSTM

Question

shu*_*ngh 9 python lstm keras recurrent-neural-network

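I have read a sequence of images into a numpy array of shape (7338, 225, 1024, 3), where 7338 is the number of samples, 225 is the number of time steps, and 1024 (32x32) is the number of flattened image pixels, each with 3 channels (RGB).

I have a Sequential model with an LSTM layer: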

from keras.models import Sequential
from keras.layers import LSTM

model = Sequential()
model.add(LSTM(128, input_shape=(225, 1024, 3)))  # raises the error shown below

But this results in the error:

Input 0 is incompatible with layer lstm_1: expected ndim=3, found ndim=4

The documentation says that the input to an LSTM layer should be a 3D tensor with shape (batch_size, timesteps, input_dim), but in my case input_dim is 2D.

What is the recommended way to feed 3-channel images into an LSTM layer in Keras?

Answer 1

Dan*_*ler 8

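If you want the images to form a sequence (like the frames of a movie), you need to treat the pixels and channels together as features:

input_shape = (225, 3072)  # 1024 pixels * 3 channels; a 3D input once the unspecified batch dimension (7338) is counted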
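
To make the shape change concrete, here is a minimal sketch of merging the pixel and channel axes before the LSTM. It assumes TensorFlow's bundled Keras; the array `frames` is random stand-in data with only 4 samples instead of 7338, and the variable names are illustrative, not from the question.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Stand-in for the (7338, 225, 1024, 3) array from the question;
# the sample count is reduced to 4 so the sketch runs quickly.
frames = np.random.rand(4, 225, 1024, 3).astype("float32")

# Merge the pixel and channel axes: 1024 * 3 = 3072 features per time step.
sequences = frames.reshape(frames.shape[0], frames.shape[1], -1)

model = keras.Sequential([
    keras.Input(shape=(225, 3072)),  # batch size is left unspecified
    layers.LSTM(128),
])

features = model.predict(sequences, verbose=0)
print(sequences.shape)  # (4, 225, 3072)
print(features.shape)   # (4, 128)
```

The LSTM now sees one 3072-dimensional vector per time step, which satisfies the (batch_size, timesteps, input_dim) requirement quoted in the question.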

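If you need more processing before feeding the 3072 features into the LSTM, you can combine or interleave 2D convolutions and LSTMs to get a more refined model (not necessarily a better one, since every application behaves differently).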

You can also try the new ConvLSTM2D, which takes a five-dimensional input:

input_shape = (225, 32, 32, 3)  # a 5D input once the unspecified batch dimension (7338) is counted
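
A rough sketch of the ConvLSTM2D route follows. It assumes TensorFlow's bundled Keras and assumes the 1024 pixels were flattened in row-major order, so they can be reshaped back to 32x32. The filter count and kernel size are arbitrary illustrative choices, and the data is a random stand-in with 2 samples.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Stand-in for the question's array, with only 2 samples.
frames = np.random.rand(2, 225, 1024, 3).astype("float32")

# Restore the spatial layout: (samples, time, rows, cols, channels).
# Assumption: pixels were flattened row by row from 32x32 images.
videos = frames.reshape(-1, 225, 32, 32, 3)

model = keras.Sequential([
    keras.Input(shape=(225, 32, 32, 3)),
    # 8 filters and a 3x3 kernel are illustrative, not tuned values.
    layers.ConvLSTM2D(filters=8, kernel_size=(3, 3), padding="same"),
])

out = model.predict(videos, verbose=0)
print(out.shape)  # (2, 32, 32, 8)
```

With the default return_sequences=False the layer returns only the last time step, so the output is one 32x32 feature map with 8 channels per sample.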

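I would probably build a convolutional network with several TimeDistributed(Conv2D(...)) and TimeDistributed(MaxPooling2D(...)) layers, then add a TimeDistributed(Flatten()), and finally the LSTM(). This will very likely improve both the model's understanding of the images and the performance of the LSTM.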
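
The stack described above could look roughly like this. It is a sketch assuming TensorFlow's bundled Keras and 32x32x3 frames; the number of convolution blocks, the filter counts, and the kernel sizes are illustrative guesses rather than recommended values.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(225, 32, 32, 3)),  # (time, rows, cols, channels)
    # The same convolution is applied to each of the 225 frames.
    layers.TimeDistributed(layers.Conv2D(16, (3, 3), activation="relu", padding="same")),
    layers.TimeDistributed(layers.MaxPooling2D((2, 2))),  # 32x32 -> 16x16
    layers.TimeDistributed(layers.Conv2D(32, (3, 3), activation="relu", padding="same")),
    layers.TimeDistributed(layers.MaxPooling2D((2, 2))),  # 16x16 -> 8x8
    layers.TimeDistributed(layers.Flatten()),             # 8 * 8 * 32 = 2048 features per frame
    layers.LSTM(128),
])

# Random stand-in batch with 2 samples, just to check the shapes.
videos = np.random.rand(2, 225, 32, 32, 3).astype("float32")
out = model.predict(videos, verbose=0)
print(out.shape)  # (2, 128)
```

Here the LSTM receives 2048 learned features per frame instead of 3072 raw pixel values.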
