I am trying to implement gradient descent by hand in PyTorch as a learning exercise. I have the following code to create a synthetic dataset:
import torch
torch.manual_seed(0)
N = 100
x = torch.rand(N,1)*5
# Let the following command be the true function
y = 2.3 + 5.1*x
# Get some noisy observations
y_obs = y + 2*torch.randn(N,1)
Then I create my prediction function (y_pred) as shown below.
w = torch.randn(1, requires_grad=True)
b = torch.randn(1, requires_grad=True)
y_pred = w*x+b
mse = torch.mean((y_pred-y_obs)**2)
I use the MSE to infer the weights w and b. I use the block below to update the values according to the gradient.
gamma = 1e-2
for i in range(100):
    w = w - gamma * w.grad
    b = b - gamma * b.grad
    mse.backward()
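However, the loop only works in the first iteration. From the second iteration onward, w.grad is set to None. I am fairly sure this happens because I am setting w as a function of itself (I could be wrong).
The question is: how do I correctly update the weights using the gradient information?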
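The suspicion above can be checked directly. The following is a minimal sketch, not part of the original question: the name w_new is introduced only for illustration, and the data setup is copied from the question. It shows that the reassigned tensor is a non-leaf tensor, and autograd does not populate .grad for non-leaf tensors by default.

```python
import warnings
import torch

torch.manual_seed(0)
x = torch.rand(100, 1) * 5
y_obs = 2.3 + 5.1 * x + 2 * torch.randn(100, 1)

w = torch.randn(1, requires_grad=True)
b = torch.randn(1, requires_grad=True)

mse = torch.mean((w * x + b - y_obs) ** 2)
mse.backward()
# w was created directly by the user, so it is a leaf and receives a gradient
print("w is leaf:", w.is_leaf, "| w.grad populated:", w.grad is not None)

# The update is itself a tracked operation, so its result is a non-leaf tensor
# (w_new is a hypothetical name used here so both tensors stay visible)
w_new = w - 1e-2 * w.grad
print("w_new is leaf:", w_new.is_leaf, "| requires_grad:", w_new.requires_grad)

# Backpropagating through w_new sends the gradient to the original w instead
mse_new = torch.mean((w_new * x + b - y_obs) ** 2)
mse_new.backward()
with warnings.catch_warnings():
    # Reading .grad on a non-leaf tensor emits a UserWarning; silence it here
    warnings.simplefilter("ignore")
    grad_is_none = w_new.grad is None
print("w_new.grad is None:", grad_is_none)
```

So the gradient is not lost; it accumulates on the original leaf tensor, which the name w no longer refers to after the reassignment in the loop.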
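The following code works on my machine and gives w ≈ 5.1 and b ≈ 2.2 after 500 training iterations. It recomputes the loss inside the loop and performs the update under torch.no_grad(), so the new w and b are leaf tensors again once requires_grad is switched back on.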
Code:
import torch
torch.manual_seed(0)
N = 100
x = torch.rand(N,1)*5
# Let the following command be the true function
y = 2.3 + 5.1*x
# Get some noisy observations
y_obs = y + 0.2*torch.randn(N,1)
w = torch.randn(1, requires_grad=True)
b = torch.randn(1, requires_grad=True)
gamma = 0.01
for i in range(500):
    print(i)
    # use new weight to calculate loss
    y_pred = w * x + b
    mse = torch.mean((y_pred - y_obs) ** 2)
    # backward
    mse.backward()
    print('w:', w)
    print('b:', b)
    print('w.grad:', w.grad)
    print('b.grad:', b.grad)
    # gradient descent, don't track
    with torch.no_grad():
        w = w - gamma * w.grad
        b = b - gamma * b.grad
    w.requires_grad = True
    b.requires_grad = True
Output:
499
w: tensor([5.1095], requires_grad=True)
b: tensor([2.2474], requires_grad=True)
w.grad: tensor([0.0179])
b.grad: tensor([-0.0576])
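A common alternative, sketched here as an assumption and not taken from the answer's code, is to update the weights in place so that w and b remain the same leaf tensors throughout. In that case the gradients have to be zeroed after every step, because backward() accumulates into .grad.

```python
import torch

torch.manual_seed(0)
N = 100
x = torch.rand(N, 1) * 5
y_obs = 2.3 + 5.1 * x + 0.2 * torch.randn(N, 1)

w = torch.randn(1, requires_grad=True)
b = torch.randn(1, requires_grad=True)
gamma = 0.01

for i in range(500):
    # Rebuild the graph with the current weights on every iteration
    y_pred = w * x + b
    mse = torch.mean((y_pred - y_obs) ** 2)
    mse.backward()

    # In-place update outside of autograd tracking; w and b stay leaf tensors
    with torch.no_grad():
        w -= gamma * w.grad
        b -= gamma * b.grad

    # backward() adds to .grad, so reset it before the next iteration
    w.grad.zero_()
    b.grad.zero_()

print('w:', w.item(), 'b:', b.item())
```

With the in-place form there is no need to set requires_grad again, since no new tensor is ever bound to w or b.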