How do I apply a masking layer to a sequential CNN model in Keras?


I am having trouble applying a Masking layer to the CNN part of a CNN + RNN/LSTM model.

My data is not raw images; I transform each timestep into shape (16, 34, 4) (channels_first). The data is sequential, and the longest sequence has 22 steps, so for a fixed input shape I set the number of timesteps to 22. Since a sequence can be shorter than 22 steps, I pad the remaining steps with np.zeros. The zero padding makes up roughly half of the whole dataset, and with that much useless padded data, training cannot reach a good result. I therefore want to add a mask to exclude the zero-padded steps.
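(For reference, a minimal sketch of that zero-padding step; pad_to_max and the constant names are illustrative, not from the original code.)

import numpy as np

MAX_STEPS = 22              # length of the longest sequence
FRAME_SHAPE = (16, 34, 4)   # channels_first shape of one timestep

def pad_to_max(frames):
    """Zero-pad a (t, 16, 34, 4) array with t <= 22 up to MAX_STEPS steps."""
    padded = np.zeros((MAX_STEPS,) + FRAME_SHAPE, dtype=frames.dtype)
    padded[:frames.shape[0]] = frames
    return padded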

Here is my code:

import numpy as np
from keras.models import Sequential
from keras.layers import (Masking, TimeDistributed, Conv2D,
                          BatchNormalization, Dropout, Flatten, GRU, Dense)

# attempted mask: an all-zeros array matching one timestep
mask = np.zeros((16, 34, 4), dtype=np.int8)
input_shape = (22, 16, 34, 4)

model = Sequential()
model.add(TimeDistributed(Masking(mask_value=mask), input_shape=input_shape, name='mask'))
model.add(TimeDistributed(Conv2D(100, (5, 2), data_format='channels_first', activation='relu'), name='conv1'))
model.add(TimeDistributed(BatchNormalization(), name='bn1'))
model.add(Dropout(0.5, name='drop1'))
model.add(TimeDistributed(Conv2D(100, (5, 2), data_format='channels_first', activation='relu'), name='conv2'))
model.add(TimeDistributed(BatchNormalization(), name='bn2'))
model.add(Dropout(0.5, name='drop2'))
model.add(TimeDistributed(Conv2D(100, (5, 2), data_format='channels_first', activation='relu'), name='conv3'))
model.add(TimeDistributed(BatchNormalization(), name='bn3'))
model.add(Dropout(0.5, name='drop3'))
model.add(TimeDistributed(Flatten(), name='flatten'))
model.add(GRU(256, activation='tanh', return_sequences=True, name='gru'))
model.add(Dropout(0.4, name='drop_gru'))
model.add(Dense(35, activation='softmax', name='softmax'))
model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['acc'])

Here is the model structure, from model.summary():

_________________________________________________________________  
Layer (type)                 Output Shape              Param #     
=================================================================  
mask (TimeDistributed)       (None, 22, 16, 34, 4)     0           
_________________________________________________________________  
conv1 (TimeDistributed)      (None, 22, 100, 30, 3)    16100       
_________________________________________________________________  
bn1 (TimeDistributed)        (None, 22, 100, 30, 3)    12          
_________________________________________________________________  
drop1 (Dropout)              (None, 22, 100, 30, 3)    0           
_________________________________________________________________  
conv2 (TimeDistributed)      (None, 22, 100, 26, 2)    100100      
_________________________________________________________________  
bn2 (TimeDistributed)        (None, 22, 100, 26, 2)    8           
_________________________________________________________________  
drop2 (Dropout)              (None, 22, 100, 26, 2)    0           
_________________________________________________________________  
conv3 (TimeDistributed)      (None, 22, 100, 22, 1)    100100      
_________________________________________________________________  
bn3 (TimeDistributed)        (None, 22, 100, 22, 1)    4           
_________________________________________________________________  
drop3 (Dropout)              (None, 22, 100, 22, 1)    0           
_________________________________________________________________  
flatten (TimeDistributed)    (None, 22, 2200)          0           
_________________________________________________________________  
gru (GRU)                    (None, 22, 256)           1886976     
_________________________________________________________________  
drop_gru (Dropout)           (None, 22, 256)           0           
_________________________________________________________________  
softmax (Dense)              (None, 22, 35)            8995        
=================================================================  
Total params: 2,112,295  
Trainable params: 2,112,283  
Non-trainable params: 12  
_________________________________________________________________

For mask_value, I tried 0 as well as the mask array above, but neither works; the model still trains on all the data, half of which is zero padding.
Can anyone help me?
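(For reference, the Keras Masking layer documents mask_value as a scalar: a timestep is skipped when all of its features equal that value. A minimal sketch on flattened data, with illustrative shapes:)

from keras.models import Sequential
from keras.layers import Masking, GRU

m = Sequential()
# Steps whose 16*34*4 flattened features are all 0.0 are skipped by
# mask-aware layers such as GRU; Conv2D does not consume masks.
m.add(Masking(mask_value=0.0, input_shape=(22, 16 * 34 * 4)))
m.add(GRU(256, return_sequences=True))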

By the way, I use TimeDistributed here to connect to the RNN; I know of an alternative called ConvLSTM2D. Does anyone know the difference? ConvLSTM2D needs far more model parameters and trains much more slowly than the TimeDistributed approach...
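(For comparison, a minimal sketch of a ConvLSTM2D block in place of TimeDistributed(Conv2D); the filter count is illustrative. In ConvLSTM2D the input and recurrent transitions of all four LSTM gates are convolutions, which is why it carries more parameters.)

from keras.models import Sequential
from keras.layers import ConvLSTM2D

alt = Sequential()
# Convolution and recurrence fused into one layer; each gate is a convolution.
alt.add(ConvLSTM2D(100, (5, 2), data_format='channels_first',
                   input_shape=(22, 16, 34, 4), return_sequences=True))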

Answer:

Unfortunately, masking is not yet supported by the Keras Conv layers. Several issues about this have been posted on the Keras GitHub page; this is the one with the most discussion on the topic. It seems there were some pending implementation details, and the issue was never resolved.

The workaround proposed in that discussion is to embed the padding character explicitly in your sequences and apply global pooling. Here is another workaround I found (it did not help my use case, but it may help yours): keep a mask array alongside the data and merge it back in by multiplication (a sketch follows).
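(A minimal sketch of the multiplication workaround, assuming the TensorFlow backend; the layer graph below is illustrative, not the original model. Note that the GRU still iterates over the zeroed steps; the multiplication only blanks their features.)

from keras.models import Model
from keras.layers import Input, Lambda, TimeDistributed, Conv2D, Flatten, GRU, Dense
import keras.backend as K

inp = Input(shape=(22, 16, 34, 4))

# 1.0 for timesteps containing any non-zero value, 0.0 for pure zero padding
step_mask = Lambda(lambda x: K.expand_dims(K.cast(
    K.any(K.not_equal(x, 0.0), axis=[2, 3, 4]), 'float32')))(inp)  # (None, 22, 1)

x = TimeDistributed(Conv2D(100, (5, 2), data_format='channels_first',
                           activation='relu'))(inp)
x = TimeDistributed(Flatten())(x)
# Merge the mask back in by multiplication: padded steps become all-zero
x = Lambda(lambda t: t[0] * t[1])([x, step_mask])
x = GRU(256, return_sequences=True)(x)
out = Dense(35, activation='softmax')(x)

model = Model(inp, out)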

You can also check out this conversation about a problem similar to yours.