How do I check the output gradient of each layer in my PyTorch code?

NEW*_*ONS 8 python gradient artificial-intelligence machine-learning pytorch

I am working with PyTorch in order to learn it.

I have one question: how can I check the output gradient of each layer in my code?

My code is below:

# Import the necessary libs
import numpy as np
import torch
import time

# Loading the Fashion-MNIST dataset
from torchvision import datasets, transforms

# Get GPU Device

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")


# Define a transform to normalize the data
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))])
# Download and load the training data
trainset = datasets.FashionMNIST('MNIST_data/', download = True, train = True, transform = transform)
testset = datasets.FashionMNIST('MNIST_data/', download = True, train = False, transform = transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size = 32, shuffle = True, num_workers=4)
testloader = torch.utils.data.DataLoader(testset, batch_size = 32, shuffle = True, num_workers=4)

# Examine a sample
dataiter = iter(trainloader)
images, labels = next(dataiter)  # dataiter.next() was removed in newer PyTorch; use the built-in next()

# Define the network architecture
from torch import nn, optim
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(784, 128),
                      nn.ReLU(),
                      nn.Linear(128, 10),
                      nn.LogSoftmax(dim = 1)
                     )
model.to(device)

# Define the loss (the model already ends in LogSoftmax, so use NLLLoss;
# CrossEntropyLoss would apply log-softmax a second time)
criterion = nn.NLLLoss()

# Define the optimizer
optimizer = optim.Adam(model.parameters(), lr = 0.001)

# Define the epochs
epochs = 5

train_losses, test_losses = [], []

# start = time.time()
for e in range(epochs):
    running_loss = 0
    
    for images, labels in trainloader:
        # Flatten Fashion-MNIST images into a 784-long vector
        images = images.to(device)
        labels = labels.to(device)
        images = images.view(images.shape[0], -1)
        

        # Training pass
        optimizer.zero_grad()
    
        output = model(images)  # call the module directly instead of model.forward()
        loss = criterion(output, labels)
        
        loss.backward()
        
#         print(loss.grad)
        
        optimizer.step()

        running_loss += loss.item()
    
    else:
        print(model[0].grad)

If I print model[0].grad after back-propagation, will it be the output gradient of each layer for every epoch?

Or, if I want to know the output gradient of each layer, where and what should I print?

Thank you!!

Thanks for reading.

Sat*_*ash 19

Well, this is a good question if you want to learn about the internal computations of your model. Let me explain it to you!

So, first of all, when you print the model variable you will get this output:

Sequential(
  (0): Linear(in_features=784, out_features=128, bias=True)
  (1): ReLU()
  (2): Linear(in_features=128, out_features=10, bias=True)
  (3): LogSoftmax(dim=1)
)

If you choose model[0], that means you have selected the first layer of the model, namely Linear(in_features=784, out_features=128, bias=True). If you look at the documentation of torch.nn.Linear, you will find two variables of this class that you can access: one is Linear.weight and the other is Linear.bias, which give you the weights and biases of the corresponding layer respectively.
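For example, here is a minimal sketch of inspecting those two attributes on the first layer (assuming the model defined in the question):

first_layer = model[0]            # Linear(in_features=784, out_features=128, bias=True)
print(first_layer.weight.shape)   # torch.Size([128, 784])
print(first_layer.bias.shape)     # torch.Size([128])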

Keep in mind that you cannot use model.weight to look at the weights of the model, because your linear layers are kept inside a container called nn.Sequential, which doesn't have a weight attribute.
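To see the difference concretely (again assuming the model above):

try:
    print(model.weight)    # fails: Sequential is only a container
except AttributeError as err:
    print(err)             # 'Sequential' object has no attribute 'weight'

print(model[0].weight)     # works: the first Linear layer's weight matrix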

So, coming back to weights and biases: you can access them for each layer. model[0].weight and model[0].bias are the weights and biases of the first layer. Similarly, to access the gradients of the first layer, model[0].weight.grad and model[0].bias.grad are the gradients.
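Putting this together with your training loop: printing model[0].weight.grad anywhere after loss.backward() (and before the next optimizer.zero_grad() clears it) will show the current gradient of the first layer's weights. Below is a minimal sketch of both options, assuming the model and loop from the question. Note that a backward hook is what you want if you literally mean the gradient of each layer's output rather than of its parameters; print_grad_output is just a hypothetical name for the hook function.

# Option 1: parameter gradients, populated after loss.backward()
for name, param in model.named_parameters():
    print(name, param.grad.shape)   # e.g. 0.weight torch.Size([128, 784])

# Option 2: gradients w.r.t. each layer's *output*, via full backward hooks
def print_grad_output(module, grad_input, grad_output):
    # grad_output[0] is the gradient of the loss w.r.t. this module's output
    print(module.__class__.__name__, grad_output[0].shape)

for layer in model:
    layer.register_full_backward_hook(print_grad_output)
# The hooks fire on every subsequent loss.backward() call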