Sla*_*myl 14 tensorflow tensorflow2.0
How can I change the learning rate of an Adam optimizer while training is in progress in TF2? There are some answers floating around, but they apply to TF1, e.g. using feed_dict.
Ali*_*ehi 16
If you are using a custom training loop (instead of keras.fit()), you can simply do:
new_learning_rate = 0.01
my_optimizer.lr.assign(new_learning_rate)
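For reference, here is a minimal sketch of where such an assignment could sit inside a custom training loop. The toy data, model, and decay factor below are illustrative assumptions, not part of the original answer:

import tensorflow as tf

# toy data and model just to make the sketch self-contained
x = tf.random.normal((256, 10))
y = tf.random.uniform((256,), maxval=2, dtype=tf.int32)
dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(32)

model = tf.keras.Sequential([tf.keras.layers.Dense(2)])
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
my_optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)

for epoch in range(5):
  for x_batch, y_batch in dataset:
    with tf.GradientTape() as tape:
      loss = loss_fn(y_batch, model(x_batch, training=True))
    grads = tape.gradient(loss, model.trainable_variables)
    my_optimizer.apply_gradients(zip(grads, model.trainable_variables))

  # lower the learning rate by 10% after every epoch
  new_learning_rate = float(my_optimizer.lr) * 0.9
  my_optimizer.lr.assign(new_learning_rate)
  print("Epoch {}: lr is now {:.6f}".format(epoch, float(my_optimizer.lr)))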
Ste*_*t_R 13
You can read and assign the learning rate via a callback, so you can use something like this:
import tensorflow as tf

class LearningRateReducerCb(tf.keras.callbacks.Callback):

  def on_epoch_end(self, epoch, logs={}):
    old_lr = self.model.optimizer.lr.read_value()
    new_lr = old_lr * 0.99
    print("\nEpoch: {}. Reducing Learning Rate from {} to {}".format(epoch, old_lr, new_lr))
    self.model.optimizer.lr.assign(new_lr)
Using the MNIST demo, for example, you can apply it like this:
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, callbacks=[LearningRateReducerCb()], epochs=5)
model.evaluate(x_test, y_test)
which gives output like this:
Train on 60000 samples
Epoch 1/5
59744/60000 [============================>.] - ETA: 0s - loss: 0.2969 - accuracy: 0.9151
Epoch: 0. Reducing Learning Rate from 0.0010000000474974513 to 0.0009900000877678394
60000/60000 [==============================] - 6s 92us/sample - loss: 0.2965 - accuracy: 0.9152
Epoch 2/5
59488/60000 [============================>.] - ETA: 0s - loss: 0.1421 - accuracy: 0.9585
Epoch: 1. Reducing Learning Rate from 0.0009900000877678394 to 0.000980100128799677
60000/60000 [==============================] - 5s 91us/sample - loss: 0.1420 - accuracy: 0.9586
Epoch 3/5
59968/60000 [============================>.] - ETA: 0s - loss: 0.1056 - accuracy: 0.9684
Epoch: 2. Reducing Learning Rate from 0.000980100128799677 to 0.0009702991228550673
60000/60000 [==============================] - 5s 91us/sample - loss: 0.1056 - accuracy: 0.9684
Epoch 4/5
59520/60000 [============================>.] - ETA: 0s - loss: 0.0856 - accuracy: 0.9734
Epoch: 3. Reducing Learning Rate from 0.0009702991228550673 to 0.0009605961386114359
60000/60000 [==============================] - 5s 89us/sample - loss: 0.0857 - accuracy: 0.9733
Epoch 5/5
59712/60000 [============================>.] - ETA: 0s - loss: 0.0734 - accuracy: 0.9772
Epoch: 4. Reducing Learning Rate from 0.0009605961386114359 to 0.0009509901865385473
60000/60000 [==============================] - 5s 87us/sample - loss: 0.0733 - accuracy: 0.9772
10000/10000 [==============================] - 0s 43us/sample - loss: 0.0768 - accuracy: 0.9762
[0.07680597708942369, 0.9762]
If you want to use low-level control rather than the fit callback functionality, have a look at tf.optimizers.schedules. Here is some example code:
train_steps = 25000
lr_fn = tf.optimizers.schedules.PolynomialDecay(1e-3, train_steps, 1e-5, 2)
opt = tf.optimizers.Adam(lr_fn)
This decays the learning rate from 1e-3 to 1e-5 over 25000 steps with a power-2 polynomial decay.
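To see the shape of that decay, the schedule object can be called directly with a step number. A quick check of my own, not from the original answer:

import tensorflow as tf

train_steps = 25000
lr_fn = tf.optimizers.schedules.PolynomialDecay(1e-3, train_steps, 1e-5, 2)

# lr(step) = 1e-5 + (1e-3 - 1e-5) * (1 - step / train_steps) ** 2
for step in (0, 12500, 25000):
  print(step, float(lr_fn(step)))
# roughly 1e-3 at step 0, ~2.6e-4 halfway, and 1e-5 at step 25000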
Note:
An Optimizer instance has an internal step counter that is incremented by one each time apply_gradients is called (as far as I can tell...). This allows the procedure to work properly when using it in a low-level context (typically with tf.GradientTape).
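For instance, here is a minimal low-level sketch of that behaviour; the toy model and data are illustrative assumptions, and the optimizer picks up the schedule on its own because apply_gradients advances opt.iterations:

import tensorflow as tf

train_steps = 25000
lr_fn = tf.optimizers.schedules.PolynomialDecay(1e-3, train_steps, 1e-5, 2)
opt = tf.optimizers.Adam(lr_fn)

# toy regression model and data, only to make the sketch runnable
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
loss_fn = tf.keras.losses.MeanSquaredError()
x = tf.random.normal((1024, 4))
y = tf.random.normal((1024, 1))
dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(32).repeat()

for step, (x_batch, y_batch) in enumerate(dataset.take(200)):
  with tf.GradientTape() as tape:
    loss = loss_fn(y_batch, model(x_batch, training=True))
  grads = tape.gradient(loss, model.trainable_variables)
  opt.apply_gradients(zip(grads, model.trainable_variables))  # advances opt.iterations

  if step % 50 == 0:
    # the learning rate in effect at the current internal step
    print(int(opt.iterations), float(lr_fn(opt.iterations)))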