Keras TimeDistributed Not Masking CNN Model

Question

Ale*_* R. 8 deep-learning conv-neural-network keras tensorflow

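For the sake of example, I have an input consisting of 2 images, with total shape (2,299,299,3). I am trying to apply InceptionV3 to each image and then process the outputs with an LSTM. I am using a Masking layer to exclude a blank image from processing (specified below).

The code is: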

import numpy as np
from keras import backend as K
from keras.models import Sequential,Model
from keras.layers import Convolution2D, MaxPooling2D, ZeroPadding2D, BatchNormalization, \
    Input, GlobalAveragePooling2D, Masking, TimeDistributed, LSTM, Dense, Flatten, Reshape, Lambda, Concatenate
from keras.layers import Activation, Dropout
from keras.applications import inception_v3

IMG_SIZE=(299,299,3)
def create_base():
    base_model = inception_v3.InceptionV3(weights='imagenet', include_top=False)
    x = GlobalAveragePooling2D()(base_model.output)
    base_model=Model(base_model.input,x)
    return base_model


base_model=create_base()

#Image mask to ignore images whose pixel values are all -2
IMAGE_MASK = -2*np.expand_dims(np.ones(IMG_SIZE),0)

final_input=Input((2,IMG_SIZE[0],IMG_SIZE[1],IMG_SIZE[2]))

final_model = Masking(mask_value = -2.)(final_input)
final_model = TimeDistributed(base_model)(final_model)
final_model = Lambda(lambda x: x, output_shape=lambda s:s)(final_model)
#final_model = Reshape(target_shape=(2, 2048))(final_model)
#final_model = Masking(mask_value = 0.)(final_model)
final_model = LSTM(5,return_sequences=False)(final_model)
final_model = Model(final_input,final_model)


#Create a sample test image
TEST_IMAGE = np.ones(IMG_SIZE)

#Create a test sample input, consisting of a normal image and a masked image
TEST_SAMPLE = np.concatenate((np.expand_dims(TEST_IMAGE,axis=0),IMAGE_MASK))



inp = final_model.input                                           # input placeholder
outputs = [layer.output for layer in final_model.layers]          # all layer outputs
functors = [K.function([inp]+ [K.learning_phase()], [out]) for out in outputs]
layer_outs = [func([np.expand_dims(TEST_SAMPLE,0), 1.]) for func in functors]

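This does not work correctly. Specifically, the model should mask the IMAGE_MASK part of the input, but the image is still processed by Inception (giving a non-zero output). Here are the details:

layer_outs[-1], the LSTM output, is fine: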

[array([[-0.15324114, -0.09620268, -0.01668587, 0.07938149, -0.00757846]], dtype=float32)]

layer_outs[-2] and layer_outs[-3], the LSTM input, are wrong; the second array should be all zeros:

[array([[[ 0.37713543, 0.36381325, 0.36197218, ..., 0.23298527, 0.43247852, 0.34844452], [ 0.24972123, 0.2378867 , 0.11810347, ..., 0.51930511, 0.33289322, 0.33403745]]], dtype=float32)]

layer_outs[-4], the input to the CNN, is correctly masked:

[[ 1.,  1.,  1.],
           [ 1.,  1.,  1.],
           [ 1.,  1.,  1.],
           ..., 
           [ 1.,  1.,  1.],
           [ 1.,  1.,  1.],
           [ 1.,  1.,  1.]]],


         [[[-0., -0., -0.],
           [-0., -0., -0.],
           [-0., -0., -0.],
           ..., 
           [-0., -0., -0.],
           [-0., -0., -0.],
           [-0., -0., -0.]],

Note that the code seems to work correctly with a simpler base_model, such as:

def create_base():
    input_layer=Input(IMG_SIZE)
    base_model=Flatten()(input_layer)
    base_model=Dense(2048)(base_model)
    base_model=Model(input_layer,base_model)
    return base_model

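I have exhausted most online resources. Permutations of this question have been asked on Keras's GitHub, such as here, here and here, but I can't seem to find any concrete resolution.

The links suggest that the issues stem from TimeDistributed being applied to BatchNormalization, and that the hacky fixes of either a Lambda identity layer or Reshape layers remove the errors, but neither seems to produce a correct model.

I've tried to force the base model to support masking via: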

base_model.__setattr__('supports_masking',True)

I've also tried applying an identity layer via:

TimeDistributed(Lambda(lambda x: base_model(x), output_shape=lambda s:s))(final_model)

but none of these seem to work. Note that I would like the final model to be trainable; in particular, the CNN part of it should remain trainable.

Answer 1

Dan*_*ler 3

Not entirely sure this will work, but based on the comment here, it should work with a newer version of tensorflow + keras:

final_model = TimeDistributed(Flatten())(final_input)
final_model = Masking(mask_value = -2.)(final_model)
final_model = TimeDistributed(Reshape(IMG_SIZE))(final_model)
final_model = TimeDistributed(base_model)(final_model)
final_model = Model(final_input,final_model)

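I took a look at the source code of Masking, and I noticed that Keras creates a mask tensor that reduces only the last axis. As long as you are dealing with 5D tensors this causes no problem, but when you reduce the dimensions for the LSTM, this mask tensor becomes incompatible.

Doing a first flatten step before masking ensures that the mask tensor works properly for 3D tensors. Then you reshape each image back to its original size.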
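To make the shape argument concrete, here is a minimal NumPy sketch of the mask rule described above (a timestep entry is kept if any value along the last axis differs from the mask value). The helper `compute_mask` is a hypothetical stand-in written for illustration, not the Keras implementation itself; it only mirrors the "reduce over the last axis" behaviour.

```python
import numpy as np

IMG_SIZE = (299, 299, 3)
MASK_VALUE = -2.0

def compute_mask(x, mask_value):
    """Hypothetical stand-in for the Masking rule: keep a position if ANY
    entry along the last axis differs from mask_value."""
    return np.any(x != mask_value, axis=-1)

# One sample made of a normal image followed by a fully masked image,
# shape (1, 2, 299, 299, 3).
sample = np.stack([np.ones(IMG_SIZE), MASK_VALUE * np.ones(IMG_SIZE)])[np.newaxis]

# Masking the 5D tensor directly: only the channel axis is reduced, so the
# mask still has the two spatial axes and cannot line up with (batch, timesteps).
mask_5d = compute_mask(sample, MASK_VALUE)
print(mask_5d.shape)     # (1, 2, 299, 299)

# Flattening each image first: the last axis now covers the whole image,
# so the mask has one entry per timestep.
flat = sample.reshape(1, 2, -1)
mask_3d = compute_mask(flat, MASK_VALUE)
print(mask_3d.shape)     # (1, 2)
print(mask_3d.tolist())  # [[True, False]]
```

The per-timestep mask of shape (1, 2) is the form a recurrent layer expects, which is presumably why flattening before Masking helps.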

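I'll probably try to install newer versions soon to test it myself, but these installation procedures have caused too much trouble, and I'm in the middle of something important.

On my machine, this code compiles, but that strange error appears at prediction time (see the link in the first line of this answer).

Creating a model for predicting the intermediate layers

From the code I've seen, I'm not sure the masking function is kept inside the tensors. I don't know exactly how it works, but it seems to be managed separately from the building of the functions inside the layers.

So, try using a standard keras model to make the predictions: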

inp = final_model.input                                           # input placeholder
outputs = [layer.output for layer in final_model.layers]          # all layer outputs

fullModel = Model(inp,outputs)
layerPredictions = fullModel.predict(np.expand_dims(TEST_SAMPLE,0))

print(layerPredictions[-2])
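On the point that the mask is managed separately from the tensor values: the sketch below is a toy NumPy recurrent loop, not Keras's LSTM, showing what a per-timestep mask means for a recurrent layer. At a masked step the previous state is simply carried over, so whatever feature values sit at that step do not affect the result. The function `masked_rnn` and the weights are hypothetical, made up for illustration.

```python
import numpy as np

def masked_rnn(features, mask, w_in, w_rec):
    """Toy tanh RNN over (timesteps, dim) features with a (timesteps,) bool mask.
    At masked steps (mask False) the previous state is carried over unchanged."""
    state = np.zeros(w_rec.shape[0])
    for x_t, keep in zip(features, mask):
        candidate = np.tanh(x_t @ w_in + state @ w_rec)
        state = candidate if keep else state
    return state

rng = np.random.default_rng(0)
w_in = 0.1 * rng.normal(size=(8, 5))   # hypothetical input weights
w_rec = 0.1 * rng.normal(size=(5, 5))  # hypothetical recurrent weights

# Two sequences that share the first timestep but differ in the second one.
step0 = rng.normal(size=8)
seq_a = np.stack([step0, rng.normal(size=8)])
seq_b = np.stack([step0, rng.normal(size=8)])
mask = np.array([True, False])         # second timestep is masked

out_a = masked_rnn(seq_a, mask, w_in, w_rec)
out_b = masked_rnn(seq_b, mask, w_in, w_rec)
print(np.allclose(out_a, out_b))       # True: the masked step is ignored
```

If the mask does reach the recurrent layer, non-zero features at the masked timestep would not by themselves show that the step was used. Under that assumption, a more direct check might be to change the contents of the masked image and see whether the final LSTM output changes.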


Archived:	7 years, 10 months ago
Views:	1169
Last recorded:	7 years, 6 months ago