How does one implement adversarial examples in PyTorch?

Cha*_*ker 6 python machine-learning neural-network conv-neural-network pytorch

I want to reproduce the box-constrained minimization

    minimize  c·|r| + loss_f(x + r, l)   subject to  x + r ∈ [0, 1]^m

from the paper https://arxiv.org/pdf/1312.6199.pdf. I was wondering how to implement this in PyTorch. My main confusion is that for loss_f I am using a torch.nn.CrossEntropyLoss() criterion. Do I just need to change the code I already have from:

loss = criterion(outputs+r, labels)
loss.backward()

to:

loss = criterion(outputs+r, labels)
loss = loss + c * r.norm(2)
loss.backward()

or something along those lines (of course with r included in the optimizer!). I know it's not quite right, because I haven't made explicit how I implement x + r or the hypercube constraint, but those are the parts I still need to figure out.

I guess for now I want to focus on the case without the hypercube constraint first. If we assume I'm fine there, is the above correct? I just want to know whether:

loss = loss + c * r.norm(2)

does the trick.
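If it helps to make the unconstrained version concrete, here is a minimal, self-contained sketch of that idea; the tiny nn.Linear model, the input, and all hyperparameters below are placeholders for illustration, not the setup from the question:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(10, 3)                 # toy stand-in for the real network
x = torch.randn(1, 10)                   # original input
target = torch.tensor([2])               # class we want x + r classified as

# start r slightly off zero so the gradient of r.norm(2) is well defined
r = (1e-3 * torch.randn(1, 10)).requires_grad_(True)
optimizer = torch.optim.SGD([r], lr=0.1)
criterion = nn.CrossEntropyLoss()
c = 0.01                                 # weight of the norm penalty

with torch.no_grad():
    base_loss = criterion(model(x), target).item()

for _ in range(100):
    optimizer.zero_grad()
    # r perturbs the *input*: the forward pass is model(x + r),
    # not criterion(outputs + r, labels)
    loss = criterion(model(x + r), target) + c * r.norm(2)
    loss.backward()
    optimizer.step()

with torch.no_grad():
    final_loss = criterion(model(x + r), target).item()
```

Here r itself is the parameter handed to the optimizer, so the classification loss and the norm penalty trade off through c exactly as in the proposed loss = loss + c * r.norm(2) line.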


Now, if we do include the hypercube constraint, how does my solution change? Is that where the "penalty function method" comes in?
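For the hypercube constraint there are two common options, sketched below on toy tensors: project x + r back into the box after every update, or add a penalty term that is zero inside the box and grows outside it (the "penalty function method"); the [0, 1] box and the penalty weight are assumptions for illustration:

```python
import torch

torch.manual_seed(0)
x = torch.rand(1, 3, 4, 4)          # toy "image" already inside the box
r = 0.5 * torch.randn_like(x)       # some candidate perturbation

# Option 1: projection -- after each optimizer step on r, clamp x + r
# back into the box and read off the feasible perturbation.
x_adv = torch.clamp(x + r, 0.0, 1.0)
r_feasible = x_adv - x

# Option 2: penalty function -- add a term to the loss that is zero
# inside the box and grows linearly with the violation outside it.
penalty_weight = 10.0
violation = (torch.relu(x + r - 1.0) + torch.relu(0.0 - (x + r))).sum()
box_penalty = penalty_weight * violation   # add this term to the loss
```

Projection keeps the iterate exactly feasible at every step, while the penalty only discourages leaving the box, so the strength of penalty_weight matters.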


https://discuss.pytorch.org/t/how-does-one-implement-adversarial-examples-in-pytorch/15668

Man*_*rya 7

This is how I did it. Hope it helps.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Sat Nov 18 12:39:16 2016

@author: manojacharya
"""

import torch
import torch.nn as nn
from torch.autograd import Variable
from torchvision import models,transforms
import numpy as np
from scipy.misc import imread, imresize
import os
import matplotlib.pyplot as plt
import torch.nn.functional as F
import json



normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225])

transform = transforms.Compose(
    [transforms.ToTensor(),
     normalize])


def imshow(inp, title=None):
    """Imshow for Tensor."""
    plt.figure()
    inp = inp.data[0]
    inp = inp.numpy().transpose((1, 2, 0))
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    inp = std * inp + mean
    plt.imshow(inp)
    plt.axis('off')
    if title is not None:
        plt.title(title)

with open('imagenet.json') as f:
    imagenet_labels = json.load(f)

# In[model]:

model = models.vgg16(pretrained=True)
for param in model.parameters():
    param.requires_grad = False


def predict(img):
    pred_raw = model(img)
    pred = F.softmax(pred_raw, dim=1)
    _,indices = torch.topk(pred,k=1)
    for ind in indices.data.numpy().ravel():
        print ("%.2f%% , class: %s , %s" %(100*pred.data[0][ind],str(ind),imagenet_labels[ind]))


# In[image ]:

peppers = imread("dog.jpg")
img =  imresize(peppers,(224,224))
imgtensor = transform(img)
imgvar = Variable(imgtensor.unsqueeze(0),requires_grad=False)
imgvard = Variable(imgtensor.unsqueeze(0),requires_grad=True)
optimizer = torch.optim.Adam([imgvard], lr = 0.1)
loss_fn =  nn.CrossEntropyLoss() 

label  =  torch.LongTensor(1)
#classify the object as this label
label[0] = 80
label = Variable(label)
eps = 2/255.0

#%% 
Nepochs = 50
print ("Starting ...........")
predict(imgvar)

for epoch in range(Nepochs):
    optimizer.zero_grad()
    pred_raw = model(imgvard)
    loss = loss_fn(pred_raw, label)

    loss.backward()
    optimizer.step()

    # project the perturbed image back into an eps-ball around the original;
    # clamp the signed difference so the perturbation keeps its sign
    diff = imgvard.data - imgvar.data
    imgvard.data = imgvar.data + torch.clamp(diff, min=-eps, max=eps)

    print('epoch: {}/{}, loss: {}'.format(
                epoch + 1,Nepochs, loss.data[0]))
    predict(imgvard)
print('Finished Training')

#%%
imshow(imgvard)


#%%
plt.figure()
diffimg = diff[0].numpy()
diffimg = diffimg.transpose((1,2,0))
plt.imshow(diffimg)


Man*_*rya 4

I'll try to explain adversarial examples in a simple way. Basically, for a given example belonging to some class C_1, we want to modify this input by adding a small value r such that it does not change much visually, yet gets classified into another class C_2 with very high confidence. To do this, you need to optimize the objective from the paper:

                        minimize  c·|r| + loss_f(x + r, C_2)

So ideally we want r to be very small, which is obtained by penalizing the norm of r (the first part of the equation). The second term is the loss for classifying the input x + r into the new target class C_2. At each iteration of the loss optimization,

                        x_t = x_(t-1) + r

Also make sure x + r stays within some range of x, i.e., allow each component of x to change only by some very small amount (say 0.0001). This gives x_t, which is the adversarial example for x. I know it is confusing, but this is all you need to implement the above equation. Hope this helps.
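The iteration described above can be sketched in a few lines of modern PyTorch; the toy linear model, the target class, and the hyperparameters are placeholders, not values from the answer:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Linear(8, 4)        # placeholder for a real classifier
x = torch.rand(1, 8)           # original input belonging to class C_1
target = torch.tensor([3])     # target class C_2
eps = 0.1                      # allowed per-element change around x
lr = 0.05

x_t = x.clone()
for _ in range(200):
    x_t.requires_grad_(True)
    loss = F.cross_entropy(model(x_t), target)
    grad, = torch.autograd.grad(loss, x_t)
    with torch.no_grad():
        x_t = x_t - lr * grad                        # x_t = x_(t-1) + r
        x_t = x + torch.clamp(x_t - x, -eps, eps)    # keep x_t close to x
```

Each step descends the classification loss toward C_2, then projects the result back so that every element of x_t stays within eps of the original x.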