How to use PyTorch OneCycleLR in a training loop (and how the optimizer and scheduler interact)?

Question

I am training a neural network with RMSprop as the optimizer and OneCycleLR as the scheduler. I have been running it like this (slightly simplified code):

optimizer = torch.optim.RMSprop(model.parameters(), lr=0.00001, 
                              alpha=0.99, eps=1e-08, weight_decay=0.0001, momentum=0.0001, centered=False)
scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=0.0005, epochs=epochs)

for epoch in range(epochs):
    model.train()
    for counter, (images, targets) in enumerate(train_loader):

        # clear gradients from last run
        optimizer.zero_grad()

        # Run forward pass through the mini-batch
        outputs = model(images)

        # Calculate the losses
        loss = loss_fn(outputs, targets)

        # Calculate the gradients
        loss.backward()

        # Update parameters
        optimizer.step()   # Optimizer before scheduler????
        scheduler.step()

        # Check loss on training set
        test()

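Note the optimizer and scheduler calls in every mini-batch. This works, although when I plot the learning rate over the course of training, the curve is very bumpy. I checked the documentation again, and this is the example shown for torch.optim.lr_scheduler.OneCycleLR: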

>>> data_loader = torch.utils.data.DataLoader(...)
>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
>>> scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=0.01, steps_per_epoch=len(data_loader), epochs=10)
>>> for epoch in range(10):
>>>     for batch in data_loader:
>>>         train_batch(...)
>>>         scheduler.step()

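Here they omit optimizer.step() from the training loop. I thought that made sense: since the optimizer is passed to OneCycleLR at initialization, the scheduler must take care of this behind the scenes. But doing so gives me this warning: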

UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.

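Should I ignore it and trust the pseudocode in the docs? Well, I did, and the model did not learn anything, so the warning is correct and I put optimizer.step() back in.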

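This gets to the point that I don't really understand how the optimizer and the scheduler interact (edit: how the learning rate in the optimizer interacts with the learning rate in the scheduler). I see that the optimizer is generally run every mini-batch and the scheduler every epoch, but for OneCycleLR they want you to run it every mini-batch too.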

Any guidance (or a good tutorial article) would be appreciated!

Answer 1

aks*_*k07 6

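Call optimizer.step() before scheduler.step(). Also, for OneCycleLR, you need to call scheduler.step() after every step, i.e. after every batch - source (PyTorch docs). So your training code is correct (as far as calling step() on the optimizer and the scheduler goes).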
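
To make the interaction concrete, here is a minimal sketch (not part of the original answer) of what the scheduler does to the optimizer. It assumes a hypothetical tiny torch.nn.Linear model and random inputs, picks total_steps=100 arbitrarily, and sets cycle_momentum=False so that only the learning rate is touched. As far as I understand it, OneCycleLR writes the current learning rate into optimizer.param_groups, so the lr passed to the optimizer's constructor is overwritten as soon as the scheduler is created. The scheduler never updates the weights itself, which is why optimizer.step() cannot be dropped.

```python
import torch

torch.manual_seed(0)

# Hypothetical tiny model, used only so there are parameters to optimize.
model = torch.nn.Linear(4, 1)

optimizer = torch.optim.RMSprop(model.parameters(), lr=0.00001)
lr_from_optimizer = optimizer.param_groups[0]["lr"]  # the value we passed in

# total_steps=100 is an arbitrary choice for this sketch.
# cycle_momentum=False is an assumption made here to leave momentum untouched.
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=0.0005, total_steps=100, cycle_momentum=False
)

# The scheduler has already replaced the optimizer's lr with its own starting
# value, max_lr / div_factor (div_factor defaults to 25).
lr_after_init = optimizer.param_groups[0]["lr"]

# One training step on random data.
weights_before = model.weight.detach().clone()
loss = model(torch.randn(8, 4)).pow(2).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()   # updates the weights using the lr currently in param_groups
scheduler.step()   # only writes the next lr into param_groups

lr_after_step = optimizer.param_groups[0]["lr"]
weights_changed = not torch.equal(weights_before, model.weight.detach())

print(lr_from_optimizer, lr_after_init, lr_after_step, weights_changed)
```

In this sketch the optimizer's own lr only matters until the scheduler is constructed; after that, the schedule is determined by max_lr and the scheduler's other arguments.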

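Also, in the example you mentioned, they pass the steps_per_epoch parameter, but you do not do so in your training code. This is also mentioned in the docs. It might be what is causing the problem in your code.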
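
As an illustration of the two points above, the following is a sketch of the training loop with steps_per_epoch set to len(train_loader), so that the scheduler knows the total number of batches. The model, data, loss function and epoch count are synthetic stand-ins (assumptions, not the asker's real setup), cycle_momentum=False is my own choice to keep momentum fixed, and the per-batch call to test() is left out. The learning rate is recorded once per batch so it can be plotted afterwards.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-ins for the asker's model, data and loss (all hypothetical).
model = torch.nn.Linear(10, 1)
train_loader = DataLoader(
    TensorDataset(torch.randn(64, 10), torch.randn(64, 1)), batch_size=8
)
loss_fn = torch.nn.MSELoss()
epochs = 5
max_lr = 0.0005

optimizer = torch.optim.RMSprop(
    model.parameters(), lr=0.00001, weight_decay=0.0001
)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=max_lr,
    epochs=epochs,
    steps_per_epoch=len(train_loader),  # one scheduler step per mini-batch
    cycle_momentum=False,               # assumption: keep momentum fixed
)

lr_history = []
for epoch in range(epochs):
    model.train()
    for images, targets in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), targets)
        loss.backward()
        optimizer.step()                               # update weights first
        lr_history.append(scheduler.get_last_lr()[0])  # lr used for this batch
        scheduler.step()                               # then advance the schedule

# Index of the batch with the highest learning rate.
peak = max(range(len(lr_history)), key=lr_history.__getitem__)
print(len(lr_history), peak, lr_history[peak])
```

Plotting lr_history from a run like this should give a single rise to max_lr followed by a smooth decay. If the curve in your own run is still bumpy, it is worth checking that epochs * steps_per_epoch matches the number of times scheduler.step() is actually called.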
