NEW*_*ONS 8 python gradient artificial-intelligence machine-learning pytorch
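I am working with PyTorch in order to learn it.
I also have a question: how can I check the output gradient of each layer in my code?
My code is below.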
# Import the necessary libs
import numpy as np
import torch
import time
# Loading the Fashion-MNIST dataset
from torchvision import datasets, transforms
# Get GPU Device
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Define a transform to normalize the data
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))
                                ])
# Download and load the training data
trainset = datasets.FashionMNIST('MNIST_data/', download = True, train = True, transform = transform)
testset = datasets.FashionMNIST('MNIST_data/', download = True, train = False, transform = transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size = 32, shuffle = True, num_workers=4)
testloader = torch.utils.data.DataLoader(testset, batch_size = 32, shuffle = True, num_workers=4)
# Examine a sample
dataiter = iter(trainloader)
images, labels = next(dataiter)  # the built-in next() works on DataLoader iterators; .next() is not available in recent PyTorch
# Define the network architecture
from torch import nn, optim
import torch.nn.functional as F
model = nn.Sequential(nn.Linear(784, 128),
                      nn.ReLU(),
                      nn.Linear(128, 10),
                      nn.LogSoftmax(dim = 1)
                      )
model.to(device)
# Define the loss
criterion = nn.CrossEntropyLoss()
# Define the optimizer
optimizer = optim.Adam(model.parameters(), lr = 0.001)
# Define the epochs
epochs = 5
train_losses, test_losses = [], []
# start = time.time()
for e in range(epochs):
    running_loss = 0
    for images, labels in trainloader:
        # Flatten Fashion-MNIST images into a 784 long vector
        images = images.to(device)
        labels = labels.to(device)
        images = images.view(images.shape[0], -1)
        # Training pass
        optimizer.zero_grad()
        output = model.forward(images)
        loss = criterion(output, labels)
        loss.backward()
        # print(loss.grad)
        optimizer.step()
        running_loss += loss.item()
    else:
        print(model[0].grad)
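If I print model[0].grad after backpropagation, will it be the output gradient of each layer for each epoch?
Or, if I want to know the output gradient of each layer, what should I print, and where?
Thank you!!
Thanks for reading.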
Sat*_*ash 19
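Well, this is a good question if you need to understand the internal computations in your model. Let me explain!
First, when you print the model variable, you get this output: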
Sequential(
  (0): Linear(in_features=784, out_features=128, bias=True)
  (1): ReLU()
  (2): Linear(in_features=128, out_features=10, bias=True)
  (3): LogSoftmax(dim=1)
)
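If you select model[0], you are selecting the first layer of the model, which is Linear(in_features=784, out_features=128, bias=True). If you look at the torch.nn.Linear documentation, you will find that this class has two variables you can access. One is Linear.weight and the other is Linear.bias, which give you the weights and the bias of the corresponding layer.
Remember that you cannot use model.weight to look at the model's weights, because your linear layers are stored inside a container called nn.Sequential, and that container has no weight attribute.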
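As a minimal sketch of the point above, the snippet below rebuilds the same architecture as in the question (so that it runs on its own, without the dataset) and shows that the parameters live on the individual Linear layers rather than on the nn.Sequential container. The variable name first_layer is only illustrative.

```python
import torch
from torch import nn

# Same architecture as in the question, rebuilt here so the snippet is self-contained.
model = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
    nn.LogSoftmax(dim=1),
)

first_layer = model[0]               # nn.Sequential supports integer indexing
print(first_layer.weight.shape)      # torch.Size([128, 784])
print(first_layer.bias.shape)        # torch.Size([128])

# The container itself has no weight attribute; only the Linear layers do.
print(hasattr(model, "weight"))      # False

# ReLU (and LogSoftmax) hold no parameters at all.
print(len(list(model[1].parameters())))  # 0
```

Note that the weight of nn.Linear(784, 128) has shape (out_features, in_features), which is why it prints as 128 x 784.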
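So, coming back to weights and biases, you can access them per layer: model[0].weight and model[0].bias are the weights and bias of the first layer. Similarly, to access the gradients of the first layer, use model[0].weight.grad and model[0].bias.grad. These hold the gradients of the loss with respect to that layer's parameters and are populated once loss.backward() has run; a Linear layer has no .grad attribute of its own, so model[0].grad will not work.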
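The following is a rough sketch of how this could look in practice, not the answerer's own code. It uses a random batch as a hypothetical stand-in for one Fashion-MNIST batch, prints the parameter gradients described above, and additionally records the gradient of the loss with respect to each layer's output using register_full_backward_hook, which is available in recent PyTorch versions. The helper make_hook and the dictionary output_grads are names introduced here for illustration only.

```python
import torch
from torch import nn

torch.manual_seed(0)

model = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
    nn.LogSoftmax(dim=1),
)
criterion = nn.CrossEntropyLoss()  # same loss as in the question

# Assumption: a random batch standing in for 32 flattened Fashion-MNIST images.
images = torch.randn(32, 784)
labels = torch.randint(0, 10, (32,))

# Filled in by the hooks: layer name -> gradient of the loss w.r.t. that layer's output.
output_grads = {}

def make_hook(name):
    def hook(module, grad_input, grad_output):
        # grad_output is a tuple; its first entry corresponds to the layer's output tensor.
        output_grads[name] = grad_output[0].detach().clone()
    return hook

handles = [
    layer.register_full_backward_hook(make_hook(name))
    for name, layer in model.named_children()
]

# Before any backward pass, .grad has not been populated yet.
print(model[0].weight.grad)  # None

loss = criterion(model(images), labels)
loss.backward()

# Gradients of the loss w.r.t. the parameters (what the answer describes).
for name, param in model.named_parameters():
    print("param", name, tuple(param.grad.shape))

# Gradients of the loss w.r.t. each layer's output (what the question asks about).
for name, grad in output_grads.items():
    print("output of layer", name, tuple(grad.shape))

# Remove the hooks once they are no longer needed.
for handle in handles:
    handle.remove()
```

In the training loop from the question, the parameter gradients would be read after loss.backward() and before the next optimizer.zero_grad(), since zero_grad() resets them.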