I am training a neural network with RMSprop as the optimizer and OneCycleLR as the scheduler. I have been running it like this (slightly simplified code):
    optimizer = torch.optim.RMSprop(model.parameters(), lr=0.00001,
                                    alpha=0.99, eps=1e-08, weight_decay=0.0001,
                                    momentum=0.0001, centered=False)
    # OneCycleLR needs either total_steps or both epochs and steps_per_epoch
    scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=0.0005,
                                                    epochs=epochs,
                                                    steps_per_epoch=len(train_loader))
    for epoch in range(epochs):
        model.train()
        for counter, (images, targets) in enumerate(train_loader):
            # Clear gradients from the last run
            optimizer.zero_grad()
            # Run forward pass through the mini-batch
            outputs = model(images)
            # Calculate the losses
            loss = loss_fn(outputs, targets)
            # Calculate the gradients
            loss.backward()
            # Update parameters
            optimizer.step()  # Optimizer before scheduler?
            scheduler.step()
        # Check loss on training set
        test()
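Note that the optimizer and the scheduler are both stepped in every mini-batch. This works, although when I plot the learning rate over the course of training, the curve is very bumpy. I checked the documentation again, and this is the example shown for torch.optim.lr_scheduler.OneCycleLR: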
>>> data_loader …
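One way to inspect the schedule on its own, separate from the real training run, is to record the learning rate at every step with a throwaway model. The following is only a minimal sketch: the `torch.nn.Linear` model, the random inputs, and `steps_per_epoch = 10` are hypothetical stand-ins for the real `model` and `len(train_loader)`, while the optimizer and scheduler settings are copied from the code above.

```python
import torch

# Hypothetical stand-ins for the real model and data loader (assumptions).
model = torch.nn.Linear(4, 1)
epochs = 3
steps_per_epoch = 10  # would be len(train_loader) in real training

optimizer = torch.optim.RMSprop(model.parameters(), lr=0.00001,
                                alpha=0.99, eps=1e-08, weight_decay=0.0001,
                                momentum=0.0001, centered=False)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=0.0005, epochs=epochs, steps_per_epoch=steps_per_epoch)

lrs = []
for epoch in range(epochs):
    for step in range(steps_per_epoch):
        optimizer.zero_grad()
        # Dummy loss on random inputs, only so that optimizer.step() has gradients
        loss = model(torch.randn(8, 4)).pow(2).mean()
        loss.backward()
        optimizer.step()
        # Learning rate that was in effect for the update just applied
        lrs.append(scheduler.get_last_lr()[0])
        # One scheduler step per mini-batch, after the optimizer step
        scheduler.step()

print(len(lrs), min(lrs), max(lrs))
```

Plotting `lrs` against the step index gives the schedule as the scheduler itself produces it, which can then be compared with the curve logged during real training.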