Can I share weights between Keras layers but have other parameters differ?

mit*_*hus 12 deep-learning keras

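In Keras, is it possible to share weights between two layers while letting other parameters differ? Consider the following (admittedly somewhat contrived) example: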

conv1 = Conv2D(64, 3, input_shape=input_shape, padding='same')
conv2 = Conv2D(64, 3, input_shape=input_shape, padding='valid')

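Note that the two layers are identical except for padding. Can I get Keras to use the same weights for both (i.e. also train the network accordingly)?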

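I have looked at the Keras docs, and the section on shared layers seems to imply that sharing only works if the layers are completely identical.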

Yu-*_*ang 17

As far as I know, this cannot be done at the usual "API level" of Keras. However, if you dig a bit deeper, there are some (ugly) ways to share the weights.

First, the weights of a Conv2D layer are created inside its build() function, by calling add_weight():

    self.kernel = self.add_weight(shape=kernel_shape,
                                  initializer=self.kernel_initializer,
                                  name='kernel',
                                  regularizer=self.kernel_regularizer,
                                  constraint=self.kernel_constraint)

For the usage you describe (i.e. the default trainable / constraint / regularizer / initializer), add_weight() does nothing special beyond appending the weight variable to _trainable_weights:

    weight = K.variable(initializer(shape), dtype=dtype, name=name)
    ...
        self._trainable_weights.append(weight)

Finally, since build() is only called inside __call__() if the layer has not been built yet, shared weights between layers can be created as follows:

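  1. Call conv1.build() to initialize the variables to be shared, conv1.kernel and conv1.bias.
  2. Call conv2.build() to initialize the layer.
  3. Replace conv2.kernel and conv2.bias with conv1.kernel and conv1.bias.
  4. Remove conv2.kernel and conv2.bias from conv2._trainable_weights.
  5. Append conv1.kernel and conv1.bias to conv2._trainable_weights.
  6. Finish the model definition. At this point conv2.__call__() will be called; however, since conv2 has already been built, the weights will not be re-initialized.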
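To see why step 6 is safe, here is a toy model of the build-once guard in plain Python. It is only a sketch and not Keras code: ToyLayer, share and the use of a mutable list as a stand-in for a backend variable are all hypothetical names and simplifications chosen for illustration.

```python
class ToyLayer:
    """Hypothetical stand-in for a Keras layer (not the real implementation)."""

    def __init__(self, name, init):
        self.name = name
        self.init = init              # initial kernel values for this layer
        self.built = False
        self.kernel = None
        self._trainable_weights = []
        self.build_calls = 0

    def add_weight(self, values):
        # A mutable list plays the role of a backend variable.
        weight = list(values)
        self._trainable_weights.append(weight)
        return weight

    def build(self):
        self.build_calls += 1
        self.kernel = self.add_weight(self.init)
        self.built = True

    def __call__(self, x):
        # Weights are only created on the first call of an unbuilt layer.
        if not self.built:
            self.build()
        return sum(k * xi for k, xi in zip(self.kernel, x))


def share(src, dst):
    # Steps 1-5 from the list above, applied to the toy layers.
    src.build()
    dst.build()
    dst.kernel = src.kernel
    dst._trainable_weights = [dst.kernel]


a = ToyLayer("a", [1.0, 1.0, 1.0])
b = ToyLayer("b", [5.0, 5.0, 5.0])
share(a, b)

a.kernel[0] = 2.0             # an in-place "training update" on a's kernel
out = b([1.0, 1.0, 1.0])      # calling b does not rebuild it
print(out, b.build_calls, b.kernel is a.kernel)
```

Because b was already built before it was called, the call reuses the shared list instead of creating a fresh kernel, so the update made through a is visible in b's output.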

The following code snippet may be helpful:

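# imports assume standalone Keras 2.x
import numpy as np
from keras import backend as K
from keras.layers import Input, Conv2D, Dense, GlobalAveragePooling2D, concatenate
from keras.models import Model

def create_shared_weights(conv1, conv2, input_shape):
    # build both layers up front so that __call__() will not re-create the weights
    with K.name_scope(conv1.name):
        conv1.build(input_shape)
    with K.name_scope(conv2.name):
        conv2.build(input_shape)
    # point conv2 at conv1's variables and register them as conv2's trainable weights
    conv2.kernel = conv1.kernel
    conv2.bias = conv1.bias
    conv2._trainable_weights = []
    conv2._trainable_weights.append(conv2.kernel)
    conv2._trainable_weights.append(conv2.bias)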

# check if weights are successfully shared
input_img = Input(shape=(299, 299, 3))
conv1 = Conv2D(64, 3, padding='same')
conv2 = Conv2D(64, 3, padding='valid')
create_shared_weights(conv1, conv2, input_img._keras_shape)
print(conv2.weights == conv1.weights)  # True

# check if weights are equal after model fitting
left = conv1(input_img)
right = conv2(input_img)
left = GlobalAveragePooling2D()(left)
right = GlobalAveragePooling2D()(right)
merged = concatenate([left, right])
output = Dense(1)(merged)
model = Model(input_img, output)
model.compile(loss='binary_crossentropy', optimizer='adam')

X = np.random.rand(5, 299, 299, 3)
Y = np.random.randint(2, size=5)
model.fit(X, Y)
print([np.all(w1 == w2) for w1, w2 in zip(conv1.get_weights(), conv2.get_weights())])  # [True, True]

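One drawback of this hacky weight sharing is that the weights will not remain shared after the model is saved and loaded. This does not affect prediction, but it can be a problem if you want to load the trained model for further fine-tuning.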
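The effect can be pictured with a small plain-Python sketch. It is a toy illustration under the assumption that weights are stored per layer by value, and it is not Keras' serialization code: Var, save_weights and load_weights here are hypothetical helpers written only for this example.

```python
class Var:
    """Hypothetical stand-in for a backend variable: a mutable box of floats."""

    def __init__(self, values):
        self.values = list(values)


def save_weights(layers):
    # Per-layer snapshot by value: which objects were shared is not recorded.
    return {name: [list(v.values) for v in ws] for name, ws in layers.items()}


def load_weights(snapshot):
    # Every layer gets freshly created variables on load.
    return {name: [Var(vals) for vals in ws] for name, ws in snapshot.items()}


shared_kernel = Var([0.1, 0.2, 0.3])
layers = {"conv1": [shared_kernel], "conv2": [shared_kernel]}
restored = load_weights(save_weights(layers))

same_before = layers["conv1"][0] is layers["conv2"][0]
same_after = restored["conv1"][0] is restored["conv2"][0]
equal_after = restored["conv1"][0].values == restored["conv2"][0].values

# Simulate a fine-tuning update that only reaches conv1 in the restored model.
restored["conv1"][0].values[0] += 1.0
diverged = restored["conv1"][0].values != restored["conv2"][0].values

print(same_before, same_after, equal_after, diverged)
```

Right after loading, the two layers hold equal values, which is why prediction is unaffected; they are separate objects, though, so any further update lets them drift apart.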