Sta*_*ter 6 python machine-learning keras tensorflow
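I have a set of fairly complicated models that I am training, and I am looking for a way to save and load the model optimizer states. The "trainer models" consist of different combinations of several other "weight models", some of which share weights, some of which have weights frozen depending on the trainer, and so on. The setup is too complicated to share as an example, but in short, I cannot use model.save('model_file.h5') and keras.models.load_model('model_file.h5') when stopping and restarting training.
Using model.load_weights('weight_file.h5') works fine for testing my model once training has finished, but if I try to continue training with this method, the loss does not come close to returning to where it left off. I have read that this is because the optimizer state is not saved by this method, which makes sense. However, I need a way to save and load the optimizer states of my trainer models. It seems Keras once had model.optimizer.get_state() and model.optimizer.set_state(), which would do what I am after, but that no longer seems to be the case (at least for the Adam optimizer). Is there another solution in current Keras?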
Yu-*_*ang 14
You can extract the important lines from the load_model and save_model functions.
For saving the optimizer state, in save_model:
# Save optimizer weights.
symbolic_weights = getattr(model.optimizer, 'weights')
if symbolic_weights:
    optimizer_weights_group = f.create_group('optimizer_weights')
    weight_values = K.batch_get_value(symbolic_weights)
For loading the optimizer state, in load_model:
# Set optimizer weights.
if 'optimizer_weights' in f:
    # Build train function (to get weight updates).
    if isinstance(model, Sequential):
        model.model._make_train_function()
    else:
        model._make_train_function()
    # ...
    try:
        model.optimizer.set_weights(optimizer_weight_values)
Combining the lines above, here is an example:
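import pickle
import numpy as np
from keras import backend as K
from keras.layers import Input, Dense
from keras.models import Model

# Toy data: 100 samples, 50 features, binary labels
X, y = np.random.rand(100, 50), np.random.randint(2, size=100)
x = Input((50,))
out = Dense(1, activation='sigmoid')(x)
model = Model(x, out)
model.compile(optimizer='adam', loss='binary_crossentropy')
model.fit(X, y, epochs=5)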
Epoch 1/5
100/100 [==============================] - 0s 4ms/step - loss: 0.7716
Epoch 2/5
100/100 [==============================] - 0s 64us/step - loss: 0.7678
Epoch 3/5
100/100 [==============================] - 0s 82us/step - loss: 0.7665
Epoch 4/5
100/100 [==============================] - 0s 56us/step - loss: 0.7647
Epoch 5/5
100/100 [==============================] - 0s 76us/step - loss: 0.7638
model.save_weights('weights.h5')
symbolic_weights = getattr(model.optimizer, 'weights')
weight_values = K.batch_get_value(symbolic_weights)
with open('optimizer.pkl', 'wb') as f:
    pickle.dump(weight_values, f)
x = Input((50,))
out = Dense(1, activation='sigmoid')(x)
model = Model(x, out)
model.compile(optimizer='adam', loss='binary_crossentropy')
model.load_weights('weights.h5')
model._make_train_function()
with open('optimizer.pkl', 'rb') as f:
    weight_values = pickle.load(f)
model.optimizer.set_weights(weight_values)
model.fit(X, y, epochs=5)
Epoch 1/5
100/100 [==============================] - 0s 674us/step - loss: 0.7629
Epoch 2/5
100/100 [==============================] - 0s 49us/step - loss: 0.7617
Epoch 3/5
100/100 [==============================] - 0s 49us/step - loss: 0.7611
Epoch 4/5
100/100 [==============================] - 0s 55us/step - loss: 0.7601
Epoch 5/5
100/100 [==============================] - 0s 49us/step - loss: 0.7594
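To see why restoring the optimizer state lets the loss pick up where it left off, here is a minimal pure-Python sketch that does not involve Keras at all. The `adam_step` and `train` helpers and the toy quadratic loss are assumptions made for illustration; they follow the textbook Adam update rule, not Keras internals. The sketch compares an uninterrupted run with a run that checkpoints both the weight and the optimizer state, and with a run that keeps the weight but restarts the optimizer from scratch.

```python
import math
import pickle


def adam_step(w, grad, state, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """One textbook Adam update on a scalar weight; state is (step, m, v)."""
    t, m, v = state
    t += 1
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad * grad
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, (t, m, v)


def train(w, state, steps):
    # Toy loss (w - 3)^2, so the gradient is 2 * (w - 3).
    for _ in range(steps):
        w, state = adam_step(w, 2.0 * (w - 3.0), state)
    return w, state


fresh = (0, 0.0, 0.0)

# Reference run: 10 uninterrupted steps.
w_ref, _ = train(0.0, fresh, 10)

# Interrupted run: 5 steps, checkpoint weight and optimizer state, 5 more.
w_mid, state_mid = train(0.0, fresh, 5)
blob = pickle.dumps((w_mid, state_mid))
w_loaded, state_loaded = pickle.loads(blob)
w_resumed, _ = train(w_loaded, state_loaded, 5)

# Same checkpointed weight, but the optimizer state is thrown away.
w_reset, _ = train(w_mid, fresh, 5)

print("resumed matches reference:", w_resumed == w_ref)
print("reset differs from reference:", abs(w_reset - w_ref) > 1e-6)
```

In this toy setting the resumed run reproduces the reference trajectory, while the run with a reset optimizer drifts away from it. That is the same effect the pickled `weight_values` are meant to avoid in the Keras example.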
Ale*_*ick 10
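For those who are not using model.compile and instead perform automatic differentiation to apply the gradients manually with optimizer.apply_gradients, I think I have a solution.
First, save the optimizer weights: np.save(path, optimizer.get_weights())
Then, when you are ready to reload the optimizer, show the newly instantiated optimizer the sizes of the weights it will update by calling optimizer.apply_gradients on a list of tensors with the same sizes as the variables you compute gradients for. It is important to set the model's weights after setting the optimizer's weights, because momentum-based optimizers (like Adam) will update the model's weights even if we give them gradients that are zero.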
import tensorflow as tf
import numpy as np

model = ...  # instantiate model (functional or subclass of tf.keras.Model)

# Get saved weights
opt_weights = np.load('/path/to/saved/opt/weights.npy', allow_pickle=True)

grad_vars = model.trainable_weights
# This need not be model.trainable_weights; it must be a correctly-ordered list of
# grad_vars corresponding to how you usually call the optimizer.

optimizer = tf.keras.optimizers.Adam(lrate)  # lrate: your learning rate

zero_grads = [tf.zeros_like(w) for w in grad_vars]

# Apply zero gradients so the optimizer creates its state variables
# (with Adam this call is not guaranteed to be a no-op)
optimizer.apply_gradients(zip(zero_grads, grad_vars))

# Set the weights of the optimizer
optimizer.set_weights(opt_weights)

# NOW set the trainable weights of the model
model_weights = np.load('/path/to/saved/model/weights.npy', allow_pickle=True)
model.set_weights(model_weights)
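Note that if we try to set the weights before calling apply_gradients for the first time, an error is thrown saying that the optimizer expects a weight list of length zero.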
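The ordering concern can be illustrated without TensorFlow. The sketch below uses plain SGD with momentum as a stand-in for Adam, an assumption made for brevity; `momentum_step` is a hypothetical helper, not a TensorFlow API. It shows that a zero gradient still moves a weight once the optimizer carries nonzero momentum.

```python
def momentum_step(w, grad, velocity, lr=0.1, mu=0.9):
    """One SGD-with-momentum update on a scalar weight."""
    velocity = mu * velocity - lr * grad
    return w + velocity, velocity


# Build up some momentum with one real gradient step.
w, velocity = momentum_step(1.0, grad=2.0, velocity=0.0)

# A zero gradient with nonzero momentum still changes the weight.
w_after, _ = momentum_step(w, grad=0.0, velocity=velocity)
print("moved with momentum:", w_after != w)

# A zero gradient with zero momentum leaves the weight alone in this rule.
w_fresh, _ = momentum_step(w, grad=0.0, velocity=0.0)
print("unchanged without momentum:", w_fresh == w)
```

In this simplified rule, a brand-new optimizer with zero momentum does not move the weight on a zero gradient. Whether that holds for a real optimizer depends on its update rule, so loading the model weights last, as described above, avoids relying on it.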