Estimating high-resolution images from lower-resolution images with a ConvLSTM2D-based Keras model

Asked by hik*_*ker (tags: scikit-learn, keras, tensorflow)

I am trying to estimate a sequence of high-resolution images from a sequence of low-resolution images using the following ConvLSTM2D architecture:

import numpy as np, scipy.ndimage, matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, ConvLSTM2D, MaxPooling2D, UpSampling2D
from sklearn.metrics import accuracy_score, confusion_matrix, cohen_kappa_score
from sklearn.preprocessing import MinMaxScaler, StandardScaler
np.random.seed(123)

raw = np.arange(96).reshape(8,3,4)
data1 = scipy.ndimage.zoom(raw, zoom=(1,100,100), order=1, mode='nearest') #low res
print (data1.shape)
#(8, 300, 400)

data2 = scipy.ndimage.zoom(raw, zoom=(1,100,100), order=3, mode='nearest') #high res
print (data2.shape)
#(8, 300, 400)

X_train = data1.reshape(data1.shape[0], 1, data1.shape[1], data1.shape[2], 1)
Y_train = data2.reshape(data2.shape[0], 1, data2.shape[1], data2.shape[2], 1)
#(samples,time, rows, cols, channels)

model = Sequential()
input_shape = (data1.shape[0], data1.shape[1], data1.shape[2], 1)
#samples, time, rows, cols, channels
model.add(ConvLSTM2D(16, kernel_size=(3,3), activation='sigmoid',padding='same',input_shape=input_shape))     
model.add(ConvLSTM2D(8, kernel_size=(3,3), activation='sigmoid',padding='same'))

print (model.summary())

model.compile(loss='mean_squared_error',
              optimizer='adam',
              metrics=['accuracy'])

model.fit(X_train, Y_train, 
          batch_size=1, epochs=10, verbose=1)

x,y = model.evaluate(X_train, Y_train, verbose=0)
print (x,y)

This code produces the following ValueError:

ValueError: Input 0 is incompatible with layer conv_lst_m2d_2: expected ndim=5, found ndim=4

How can I correct this ValueError? I think the problem lies in the input shape, but I can't figure out what exactly is wrong.
Note that the output should also be a sequence of images, not a classification result.

Answered by lda*_*vid

This happens because LSTMs require temporal data, but your first layer is declared as a many-to-one model that outputs a tensor of shape (batch_size, 300, 400, 16), i.e. a batch of images:

model.add(ConvLSTM2D(16, kernel_size=(3,3), activation='sigmoid',padding='same',input_shape=input_shape))     
model.add(ConvLSTM2D(8, kernel_size=(3,3), activation='sigmoid',padding='same'))

You want the output to be a tensor of shape (batch_size, 8, 300, 400, 16) (i.e. a sequence of images) so that the second LSTM can consume it. The fix is to add return_sequences=True to the first LSTM's definition:

model.add(ConvLSTM2D(16, kernel_size=(3,3), activation='sigmoid',padding='same',input_shape=input_shape,
                     return_sequences=True))
model.add(ConvLSTM2D(8, kernel_size=(3,3), activation='sigmoid',padding='same'))
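To see why this fixes the ndim error, compare the tensor ranks involved. A minimal numpy sketch (the dimensions mirror the example data; no Keras needed to check shapes):

```python
import numpy as np

# many-to-one: the first LSTM collapses the time axis away
many_to_one = np.zeros((1, 300, 400, 16))      # (batch, rows, cols, filters)

# with return_sequences=True the time axis is preserved
many_to_many = np.zeros((1, 8, 300, 400, 16))  # (batch, time, rows, cols, filters)

print(many_to_one.ndim, many_to_many.ndim)  # 4 5
```

The second ConvLSTM2D expects the 5-dimensional tensor, which is exactly what the "expected ndim=5, found ndim=4" message is complaining about.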

You mentioned classification. If you intend to classify the whole sequence, then you need a classifier at the end:

from keras.layers import GlobalAveragePooling2D  # not imported in the original snippet

model.add(ConvLSTM2D(16, kernel_size=(3,3), activation='sigmoid', padding='same', input_shape=input_shape,
                     return_sequences=True))
model.add(ConvLSTM2D(8, kernel_size=(3,3), activation='sigmoid', padding='same'))
model.add(GlobalAveragePooling2D())
model.add(Dense(10, activation='softmax'))  # output shape: (None, 10)
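As a side note, GlobalAveragePooling2D simply averages each channel over the spatial axes, which is why it turns the 4D feature maps into a flat vector suitable for Dense. A small numpy sketch of the same reduction (the array sizes here are made up for illustration):

```python
import numpy as np

# Hypothetical feature maps: (batch, rows, cols, channels)
feature_maps = np.arange(2 * 4 * 5 * 8, dtype=float).reshape(2, 4, 5, 8)

# Global average pooling = mean over the two spatial axes
gap = feature_maps.mean(axis=(1, 2))

print(gap.shape)  # (2, 8): one averaged value per channel, per sample
```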

But if you are instead trying to classify each image in the sequence, then you can simply re-apply the classifier to every time step with TimeDistributed:

from keras.layers import Input, Dense, GlobalAveragePooling2D, ConvLSTM2D, TimeDistributed
from keras.models import Model

x = Input(shape=(300, 400, 8))
y = GlobalAveragePooling2D()(x)
y = Dense(10, activation='softmax')(y)
classifier = Model(inputs=x, outputs=y)

x = Input(shape=(data1.shape[0], data1.shape[1], data1.shape[2], 1))
y = ConvLSTM2D(16, kernel_size=(3, 3),
               activation='sigmoid',
               padding='same',
               return_sequences=True)(x)
y = ConvLSTM2D(8, kernel_size=(3, 3),
               activation='sigmoid',
               padding='same',
               return_sequences=True)(y)
y = TimeDistributed(classifier)(y)  # output shape: (None, 8, 10)

model = Model(inputs=x, outputs=y)

Finally, take a look at the examples in the Keras repository; there is a generative model that uses ConvLSTM2D.


Edit: estimating data2 from data1...

If I understand correctly this time, X_train should be 1 sample of a stack of 8 (300, 400, 1) images, not 8 samples of stacks of 1 (300, 400, 1) image.
If that's the case, then:

X_train = data1.reshape(data1.shape[0], 1, data1.shape[1], data1.shape[2], 1)
Y_train = data2.reshape(data2.shape[0], 1, data2.shape[1], data2.shape[2], 1)

should be updated to:

X_train = data1.reshape(1, data1.shape[0], data1.shape[1], data1.shape[2], 1)
Y_train = data2.reshape(1, data2.shape[0], data2.shape[1], data2.shape[2], 1)
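To make the difference concrete, here is a small numpy sketch (a zero array stands in for data1, since only the shapes matter here):

```python
import numpy as np

# Stand-in for data1: 8 frames of 300x400
data1 = np.zeros((8, 300, 400))

# 8 samples of 1-frame sequences -- what the question's code builds
wrong = data1.reshape(data1.shape[0], 1, data1.shape[1], data1.shape[2], 1)

# 1 sample of one 8-frame sequence -- what the model should see
right = data1.reshape(1, data1.shape[0], data1.shape[1], data1.shape[2], 1)

print(wrong.shape, right.shape)  # (8, 1, 300, 400, 1) (1, 8, 300, 400, 1)
```

With the first layout, every "sequence" is a single frame, so the LSTM has no temporal context to learn from.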

Also, accuracy is generally meaningless when your loss is mse. You can use other metrics instead, such as mae.
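For reference, both quantities are plain averages over the prediction errors; a minimal numpy sketch with made-up values:

```python
import numpy as np

# Toy regression targets and predictions
y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 2.0, 2.0])

mse = np.mean((y_true - y_pred) ** 2)   # mean squared error
mae = np.mean(np.abs(y_true - y_pred))  # mean absolute error

print(mse, mae)
```

Accuracy, by contrast, counts exact class matches, which is undefined in spirit for continuous pixel values.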

Now you just need to update the model to return sequences and to have a single unit in the last layer (because the images you are trying to estimate have a single channel):

model = Sequential()
input_shape = (data1.shape[0], data1.shape[1], data1.shape[2], 1)
model.add(ConvLSTM2D(16, kernel_size=(3, 3), activation='sigmoid', padding='same',
                     input_shape=input_shape,
                     return_sequences=True))
model.add(ConvLSTM2D(1, kernel_size=(3, 3), activation='sigmoid', padding='same',
                     return_sequences=True))

model.compile(loss='mse', optimizer='adam')

After that, model.fit(X_train, Y_train, ...) will start training normally:

Using TensorFlow backend.
(8, 300, 400)
(8, 300, 400)
Epoch 1/10

1/1 [==============================] - 5s 5s/step - loss: 2993.8701
Epoch 2/10

1/1 [==============================] - 5s 5s/step - loss: 2992.4492
Epoch 3/10

1/1 [==============================] - 5s 5s/step - loss: 2991.4536
Epoch 4/10

1/1 [==============================] - 5s 5s/step - loss: 2989.8523