Abr*_*Ben 3 neural-network keras resnet
我正在尝试使用keras调整resnet 50。当我冻结resnet50中的所有图层时,一切正常。但是,我想冻结部分resnet50,而不是全部。但是,当我这样做时,我会遇到一些错误。这是我的代码:
base_model = ResNet50(include_top=False, weights="imagenet", input_shape=(input_size, input_size, input_channels))
model = Sequential()
model.add(base_model)
model.add(Flatten())
model.add(Dense(80, activation="softmax"))
#this is where the error happens. The commented code works fine
"""
for layer in base_model.layers:
layer.trainable = False
"""
for layer in base_model.layers[:-26]:
layer.trainable = False
model.summary()
optimizer = Adam(lr=1e-4)
model.compile(loss="categorical_crossentropy", optimizer=optimizer, metrics=["accuracy"])
callbacks = [
EarlyStopping(monitor='val_loss', patience=4, verbose=1, min_delta=1e-4),
ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=2, cooldown=2, verbose=1),
ModelCheckpoint(filepath='weights/renet50_best_weight.fold_' + str(fold_count) + '.hdf5', save_best_only=True,
save_weights_only=True)
]
model.load_weights(filepath="weights/renet50_best_weight.fold_1.hdf5")
model.fit_generator(generator=train_generator(), steps_per_epoch=len(df_train) // batch_size, epochs=epochs, verbose=1,
callbacks=callbacks, validation_data=valid_generator(), validation_steps = len(df_valid) // batch_size)
Run Code Online (Sandbox Code Playgroud)
错误如下:
Traceback (most recent call last):
File "/home/jamesben/ai_challenger/src/train.py", line 184, in <module> model.load_weights(filepath="weights/renet50_best_weight.fold_" + str(fold_count) + '.hdf5')
File "/usr/local/lib/python3.5/dist-packages/keras/models.py", line 719, in load_weights topology.load_weights_from_hdf5_group(f, layers)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/topology.py", line 3095, in load_weights_from_hdf5_group K.batch_set_value(weight_value_tuples)
File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 2193, in batch_set_value get_session().run(assign_ops, feed_dict=feed_dict)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 767, in run run_metadata_ptr)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 944, in _run % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (128,) for Tensor 'Placeholder_72:0', which has shape '(3, 3, 128, 128)'
Run Code Online (Sandbox Code Playgroud)
谁能给我一些有关使用resnet50冻结多少层的帮助?
当使用嵌套模型load_weights()并save_weights()使用嵌套模型时,如果trainable设置不同,很容易出错。
要解决该错误,请确保在调用之前冻结相同的图层model.load_weights()。也就是说,如果保存重量文件时冻结了所有图层,则过程将是:
base_modelbase_model.layers[-26:])例如,
base_model = ResNet50(include_top=False, input_shape=(224, 224, 3))
model = Sequential()
model.add(base_model)
model.add(Flatten())
model.add(Dense(80, activation="softmax"))
for layer in base_model.layers:
layer.trainable = False
model.load_weights('all_layers_freezed.h5')
for layer in base_model.layers[-26:]:
layer.trainable = True
Run Code Online (Sandbox Code Playgroud)
调用model.load_weights()时(大致),每个图层的权重将通过以下步骤加载(load_weights_from_hdf5_group()在topology.py中的函数中):
layer.weights获取重量张量K.batch_set_value()以将权重值分配给权重张量如果您的模型是嵌套模型,则trainable由于第1步,您必须要小心。
我将用一个例子来解释它。对于与上述相同的模型,model.summary()给出:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
resnet50 (Model) (None, 1, 1, 2048) 23587712
_________________________________________________________________
flatten_10 (Flatten) (None, 2048) 0
_________________________________________________________________
dense_5 (Dense) (None, 80) 163920
=================================================================
Total params: 23,751,632
Trainable params: 11,202,640
Non-trainable params: 12,548,992
_________________________________________________________________
Run Code Online (Sandbox Code Playgroud)
内部ResNet50模型model在重量加载期间被视为一层。在加载图层时resnet50,在步骤1中,调用layer.weights等效于base_model.weights。ResNet50将收集并返回模型中所有层的重量张量列表。
现在的问题是,在构建权重张量列表时,可训练的权重将出现在不可训练的权重之前。在Layer类的定义中:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
resnet50 (Model) (None, 1, 1, 2048) 23587712
_________________________________________________________________
flatten_10 (Flatten) (None, 2048) 0
_________________________________________________________________
dense_5 (Dense) (None, 80) 163920
=================================================================
Total params: 23,751,632
Trainable params: 11,202,640
Non-trainable params: 12,548,992
_________________________________________________________________
Run Code Online (Sandbox Code Playgroud)
如果base_model冻结了所有图层,则重量张量将按以下顺序排列:
@property
def weights(self):
return self.trainable_weights + self.non_trainable_weights
Run Code Online (Sandbox Code Playgroud)
但是,如果某些层是可训练的,则可训练层的权重张量会比冻结层的权重张量大:
for layer in base_model.layers:
layer.trainable = False
print(base_model.weights)
[<tf.Variable 'conv1/kernel:0' shape=(7, 7, 3, 64) dtype=float32_ref>,
<tf.Variable 'conv1/bias:0' shape=(64,) dtype=float32_ref>,
<tf.Variable 'bn_conv1/gamma:0' shape=(64,) dtype=float32_ref>,
<tf.Variable 'bn_conv1/beta:0' shape=(64,) dtype=float32_ref>,
<tf.Variable 'bn_conv1/moving_mean:0' shape=(64,) dtype=float32_ref>,
<tf.Variable 'bn_conv1/moving_variance:0' shape=(64,) dtype=float32_ref>,
<tf.Variable 'res2a_branch2a/kernel:0' shape=(1, 1, 64, 64) dtype=float32_ref>,
<tf.Variable 'res2a_branch2a/bias:0' shape=(64,) dtype=float32_ref>,
...
<tf.Variable 'res5c_branch2c/kernel:0' shape=(1, 1, 512, 2048) dtype=float32_ref>,
<tf.Variable 'res5c_branch2c/bias:0' shape=(2048,) dtype=float32_ref>,
<tf.Variable 'bn5c_branch2c/gamma:0' shape=(2048,) dtype=float32_ref>,
<tf.Variable 'bn5c_branch2c/beta:0' shape=(2048,) dtype=float32_ref>,
<tf.Variable 'bn5c_branch2c/moving_mean:0' shape=(2048,) dtype=float32_ref>,
<tf.Variable 'bn5c_branch2c/moving_variance:0' shape=(2048,) dtype=float32_ref>]
Run Code Online (Sandbox Code Playgroud)
顺序的更改是为什么您会看到张量形状错误的原因。在上述步骤2中,保存在hdf5文件中的权重值与错误的权重张量匹配。冻结所有图层时,一切工作正常的原因是,冻结所有图层时也会保存模型检查点,因此顺序正确。
您可以使用功能性API避免嵌套模型。例如,以下代码应正常工作:
for layer in base_model.layers[-5:]:
layer.trainable = True
print(base_model.weights)
[<tf.Variable 'res5c_branch2c/kernel:0' shape=(1, 1, 512, 2048) dtype=float32_ref>,
<tf.Variable 'res5c_branch2c/bias:0' shape=(2048,) dtype=float32_ref>,
<tf.Variable 'bn5c_branch2c/gamma:0' shape=(2048,) dtype=float32_ref>,
<tf.Variable 'bn5c_branch2c/beta:0' shape=(2048,) dtype=float32_ref>,
<tf.Variable 'conv1/kernel:0' shape=(7, 7, 3, 64) dtype=float32_ref>,
<tf.Variable 'conv1/bias:0' shape=(64,) dtype=float32_ref>,
<tf.Variable 'bn_conv1/gamma:0' shape=(64,) dtype=float32_ref>,
<tf.Variable 'bn_conv1/beta:0' shape=(64,) dtype=float32_ref>,
<tf.Variable 'bn_conv1/moving_mean:0' shape=(64,) dtype=float32_ref>,
<tf.Variable 'bn_conv1/moving_variance:0' shape=(64,) dtype=float32_ref>,
<tf.Variable 'res2a_branch2a/kernel:0' shape=(1, 1, 64, 64) dtype=float32_ref>,
<tf.Variable 'res2a_branch2a/bias:0' shape=(64,) dtype=float32_ref>,
...
<tf.Variable 'bn5c_branch2b/moving_mean:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'bn5c_branch2b/moving_variance:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'bn5c_branch2c/moving_mean:0' shape=(2048,) dtype=float32_ref>,
<tf.Variable 'bn5c_branch2c/moving_variance:0' shape=(2048,) dtype=float32_ref>]
Run Code Online (Sandbox Code Playgroud)