Why do we need to explicitly zero the gradients in PyTorch? Why can't the gradients be zeroed when loss.backward() is called? What scenarios are served by keeping the gradients on the graph and requiring the user to zero them explicitly?
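As a point of reference, a minimal sketch (mine, not from any of the posts below) showing that backward() adds to .grad rather than overwriting it, and that only an explicit zero_grad() clears it:

import torch

# A single learnable parameter, w = 1.0
w = torch.nn.Parameter(torch.tensor([1.0]))
optimizer = torch.optim.SGD([w], lr=0.1)

loss = (2 * w) ** 2        # d/dw (2w)^2 = 8w = 8 at w = 1
loss.backward()
print(w.grad)              # tensor([8.])

loss = (2 * w) ** 2        # rebuild the graph for a second backward pass
loss.backward()
print(w.grad)              # tensor([16.]) -- gradients are summed, not replaced

optimizer.zero_grad()      # clears .grad (zeroed, or set to None in recent PyTorch versions)

This accumulation is exactly what makes the sub-batch trick discussed below possible.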
This training code is based on the run_glue.py script found here:
# Set the seed value all over the place to make this reproducible.
seed_val = 42
random.seed(seed_val)
np.random.seed(seed_val)
torch.manual_seed(seed_val)
torch.cuda.manual_seed_all(seed_val)
# Store the average loss after each epoch so we can plot them.
loss_values = []
# For each epoch...
for epoch_i in range(0, epochs):
    # ========================================
    #               Training
    # ========================================

    # Perform one full pass over the training set.

    print("")
    print('======== Epoch {:} / {:} ========'.format(epoch_i + 1, epochs))
    print('Training...')
    # Measure how long the …

I am trying to understand PyTorch. My question is somewhat related to these two:
Why do we need to call zero_grad() in PyTorch?
A comment on the accepted answer to the second question suggests that accumulated gradients can be used when a minibatch is too large to perform a gradient update in a single forward pass and therefore has to be split into several sub-batches.
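A minimal sketch of that sub-batch pattern, assuming a hypothetical model, criterion, and loader (none of these names come from the posts above; only the accumulation logic matters):

import torch

model = torch.nn.Linear(10, 1)
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
# Stand-in for a DataLoader yielding sub-batches of 8 samples
loader = [(torch.randn(8, 10), torch.randn(8, 1)) for _ in range(16)]

accumulation_steps = 4  # effective batch size = 4 sub-batches of 8 = 32

optimizer.zero_grad()
for i, (inputs, targets) in enumerate(loader):
    loss = criterion(model(inputs), targets)
    # Scale so the accumulated gradient matches that of one large batch
    (loss / accumulation_steps).backward()
    if (i + 1) % accumulation_steps == 0:
        optimizer.step()        # update using the accumulated gradients
        optimizer.zero_grad()   # reset before the next group of sub-batches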
Consider the following toy example:
import numpy as np
import torch
class ExampleLinear(torch.nn.Module):

    def __init__(self):
        super().__init__()
        # Initialize the weight at 1
        self.weight = torch.nn.Parameter(torch.Tensor([1]).float(),
                                         requires_grad=True)

    def forward(self, x):
        return self.weight * x


if __name__ == "__main__":
    # Example 1
    model = ExampleLinear()

    # Generate some data
    x = torch.from_numpy(np.array([4, 2])).float()
    y = 2 * x

    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    y_hat = model(x)  # forward pass
    loss = (y - y_hat) ** 2 …

I am a student and a beginner in Python and PyTorch. I have a very basic neural network, and I ran into the mentioned RuntimeError. The code to reproduce the error is this:
import numpy as np  # needed below for np.arange / np.random.shuffle
import torch
from torch import nn
from torch import optim
import torch.nn.functional as F
import matplotlib.pyplot as plt
# Ensure Reproducibility
torch.manual_seed(0)
# Data Generation
x = torch.randn((100,1), requires_grad = True)
y = 1 + 2 * x + 0.3 * torch.randn(100,1)
# Shuffles the indices
idx = np.arange(100)
np.random.shuffle(idx)
# Uses first 70 random indices for train
train_idx = idx[:70]
# Uses the remaining indices for validation
val_idx = idx[70:]
# Generates train …
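The snippet is cut off here. As a hypothetical continuation (the names x_train, y_val and the loop are mine, not recovered from the original post), the shuffled indices would be used to split the data, and zero_grad() would open each training step:

# Hypothetical continuation: split with the shuffled indices.
# .detach() cuts the slices off from the data-generation graph
# (x was created with requires_grad=True above), so each epoch's
# backward() only traverses the freshly built model graph.
x_train, y_train = x[train_idx].detach(), y[train_idx].detach()
x_val, y_val = x[val_idx].detach(), y[val_idx].detach()

model = nn.Linear(1, 1)
optimizer = optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

for epoch in range(100):
    optimizer.zero_grad()   # clear gradients left over from the previous step
    loss = loss_fn(model(x_train), y_train)
    loss.backward()
    optimizer.step()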