Correct validation loss in PyTorch?

Sta*_*Geo 4 deep-learning pytorch

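I am a bit confused about how to compute the validation loss. Should it be computed at the end of an epoch, or should the loss be monitored during the batch iterations? Below I compute it with running_loss, which is accumulated over the batches, but I would like to know whether this is the right approach.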

import torch

# `device` is assumed to be defined elsewhere, e.g. torch.device('cuda').

def validate(loader, model, criterion):
    correct = 0
    total = 0
    running_loss = 0.0
    model.eval()  # switch to evaluation mode
    with torch.no_grad():  # no gradients are needed for validation
        for i, data in enumerate(loader):
            inputs, labels = data
            inputs = inputs.to(device)
            labels = labels.to(device)

            outputs = model(inputs)
            loss = criterion(outputs, labels)
            _, predicted = torch.max(outputs, 1)  # index of the highest score per sample
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
            running_loss = running_loss + loss.item()
    mean_val_accuracy = 100 * correct / total
    mean_val_loss = running_loss  # sum of the per-batch losses, not divided by anything
    #mean_val_accuracy = accuracy(outputs,labels)
    print('Validation Accuracy: %d %%' % mean_val_accuracy)
    print('Validation Loss:', mean_val_loss)
    return mean_val_loss, mean_val_accuracy

Below is the training block I am using:

def train(loader, model, criterion, optimizer, epoch):
    model.train()  # back to training mode (validate() leaves the model in eval mode)
    running_loss = 0.0
    for i, data in enumerate(loader):
        inputs, labels = data
        inputs = inputs.to(device)
        labels = labels.to(device)

        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if i % 2000 == 1999:  # report the average loss every 2000 mini-batches
            print('[%d , %5d] loss: %.3f' % (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

    print('finished training')

Lou*_*Lac 6

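You can evaluate your network on the validation set as often as you like. It can be every epoch or, if that is too costly because the dataset is huge, every N epochs.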
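A minimal sketch of the "every N epochs" schedule, in plain Python. The names `should_validate`, `validate_every` and `num_epochs` are made up for illustration, and the calls to `train` and `validate` from the question are left as comments so the snippet runs on its own. Also validating after the last epoch is an extra assumption, not something required.

```python
def should_validate(epoch, every_n, num_epochs):
    """Return True if validation should run after this (zero-based) epoch."""
    is_nth_epoch = (epoch + 1) % every_n == 0
    is_last_epoch = epoch == num_epochs - 1  # assumption: always validate at the end
    return is_nth_epoch or is_last_epoch


num_epochs = 10      # hypothetical value
validate_every = 3   # hypothetical N

validated_epochs = []
for epoch in range(num_epochs):
    # train(train_loader, model, criterion, optimizer, epoch)
    if should_validate(epoch, validate_every, num_epochs):
        # val_loss, val_acc = validate(val_loader, model, criterion)
        validated_epochs.append(epoch + 1)

print(validated_epochs)  # [3, 6, 9, 10]
```

With `validate_every = 1` this reduces to validating after every epoch.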

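What you did seems correct: you compute the loss over the whole validation set. You can optionally divide it by its length to normalize the loss, so that the scale stays the same if you enlarge the validation set one day.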
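One possible way to do that normalization, as a sketch rather than the only option: assuming `criterion` returns the mean loss of each batch (the default `reduction='mean'` of PyTorch's built-in losses such as `nn.CrossEntropyLoss`), you can weight each `loss.item()` by the batch size and divide by the total number of samples. That gives a per-sample average that stays correct even when the last batch is smaller. `AverageMeter` below is a hypothetical helper written for this example, not part of PyTorch, and the numbers are invented.

```python
class AverageMeter:
    """Keeps a sample-weighted running average of a scalar (hypothetical helper)."""

    def __init__(self):
        self.total = 0.0  # sum of value * n over all updates
        self.count = 0    # total number of samples seen

    def update(self, value, n=1):
        # `value` is assumed to be the mean over `n` samples
        self.total += value * n
        self.count += n

    @property
    def avg(self):
        return self.total / self.count if self.count else 0.0


# Made-up example: two batches with mean losses 1.0 and 4.0 and sizes 4 and 2.
batch_losses = [1.0, 4.0]
batch_sizes = [4, 2]

meter = AverageMeter()
for batch_loss, batch_size in zip(batch_losses, batch_sizes):
    meter.update(batch_loss, batch_size)

print(sum(batch_losses))                      # 5.0, the plain sum as in the question
print(sum(batch_losses) / len(batch_losses))  # 2.5, sum divided by the number of batches
print(meter.avg)                              # 2.0, the per-sample mean
```

Inside `validate` the corresponding call would be `meter.update(loss.item(), labels.size(0))`. If all batches have the same size, dividing `running_loss` by `len(loader)` gives the same result.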