How do I rotate a PyTorch image tensor around its center in a way that supports autograd?

Question

Pro*_*Gov 5 python rotation image-rotation rotational-matrices pytorch

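I'd like to randomly rotate an image tensor (B, C, H, W) around its center (a 2D rotation, I think?). I want to avoid NumPy and Kornia, so that I basically only need to import from the torch module. I'm also not using torchvision.transforms, because I need the operation to be autograd compatible. Essentially, I'm trying to create an autograd-compatible version of torchvision.transforms.RandomRotation() for visualization techniques like DeepDream (so I need to avoid artifacts as much as possible).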

import torch
import math
import random
import torchvision.transforms as transforms
from PIL import Image


# Load image
def preprocess_simple(image_name, image_size):
    Loader = transforms.Compose([transforms.Resize(image_size), transforms.ToTensor()])
    image = Image.open(image_name).convert('RGB')
    return Loader(image).unsqueeze(0)
    
# Save image   
def deprocess_simple(output_tensor, output_name):
    output_tensor.clamp_(0, 1)
    Image2PIL = transforms.ToPILImage()
    image = Image2PIL(output_tensor.squeeze(0))
    image.save(output_name)


# Somehow rotate tensor around its center
def rotate_tensor(tensor, radians):
    ...
    return rotated_tensor

# Get a random angle within a specified range 
r_degrees = 5
angle_range = list(range(-r_degrees, r_degrees))
angle = random.randint(angle_range[0], angle_range[-1])

# Convert angle from degrees to radians
ang_rad = angle * math.pi / 180


# test_tensor = preprocess_simple('path/to/file', (512,512))
test_tensor = torch.randn(1,3,512,512)


# Rotate input tensor somehow
output_tensor = rotate_tensor(test_tensor, ang_rad)


# Optionally use this to check rotated image
# deprocess_simple(output_tensor, 'rotated_image.jpg')


Some example outputs of what I'm trying to accomplish:

Answer 1

Gil*_*sky 10

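The grid generator and the sampler are sub-modules of the Spatial Transformer (Jaderberg, Max, et al.). These sub-modules are not trainable; they let you apply both learnable and non-learnable spatial transformations. Here I take these two sub-modules and use them to rotate an image by theta, using PyTorch's functions F.affine_grid and F.grid_sample (these functions are implementations of the generator and the sampler, respectively):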

import torch
import torch.nn.functional as F
import numpy as np
import matplotlib.pyplot as plt

def get_rot_mat(theta):
    theta = torch.tensor(theta)
    return torch.tensor([[torch.cos(theta), -torch.sin(theta), 0],
                         [torch.sin(theta), torch.cos(theta), 0]])


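def rot_img(x, theta, dtype):
    # One 2x3 affine matrix per image in the batch
    rot_mat = get_rot_mat(theta)[None, ...].type(dtype).repeat(x.shape[0],1,1)
    # align_corners=False is the default; passing it explicitly avoids the UserWarning
    grid = F.affine_grid(rot_mat, x.size(), align_corners=False).type(dtype)
    x = F.grid_sample(x, grid, align_corners=False)
    return x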


#Test:
dtype =  torch.cuda.FloatTensor if torch.cuda.is_available() else torch.FloatTensor
#im should be a 4D tensor of shape B x C x H x W with type dtype, range [0,255]:
plt.imshow(im.squeeze(0).permute(1,2,0)/255) #To plot it im should be 1 x C x H x W
plt.figure()
#Rotation by np.pi/2 with autograd support:
rotated_im = rot_img(im, np.pi/2, dtype) # Rotate image by 90 degrees.
plt.imshow(rotated_im.squeeze(0).permute(1,2,0)/255)


In the example above, suppose our image, im, is a dancing cat in a skirt:

rotated_im will be the dancing cat in a skirt, rotated 90 degrees counterclockwise:
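As a sanity check on the rotation direction, the sketch below compares a pi/2 rotation against torch.rot90 on a small square tensor. It is not part of the original answer: rot_img is re-declared here in a torch-only form (no dtype argument, math instead of NumPy), and align_corners=False plus the square 8 x 8 input are assumptions. For non-square inputs the two results would not be expected to match.

```python
import math
import torch
import torch.nn.functional as F

def rot_img(x, theta):
    """Rotate a (B, C, H, W) tensor by theta radians about its center."""
    cos, sin = math.cos(theta), math.sin(theta)
    rot_mat = torch.tensor([[cos, -sin, 0.0],
                            [sin,  cos, 0.0]], dtype=x.dtype, device=x.device)
    # Repeat the 2x3 matrix so there is one per image in the batch
    rot_mat = rot_mat[None].repeat(x.shape[0], 1, 1)
    grid = F.affine_grid(rot_mat, list(x.shape), align_corners=False)
    return F.grid_sample(x, grid, align_corners=False)

torch.manual_seed(0)
x = torch.rand(2, 3, 8, 8)  # square images, values in [0, 1)

rotated = rot_img(x, math.pi / 2)
# rot90 with k=1 over (H, W) turns the image counterclockwise as displayed
reference = torch.rot90(x, k=1, dims=(2, 3))

max_diff = (rotated - reference).abs().max().item()
print(max_diff)  # expected to be close to zero, up to floating-point error
```

A small difference is expected because the grid coordinates pass through floating-point arithmetic and bilinear interpolation.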

If we call rot_img with theta equal to np.pi/4, this is what we get:

The best part is that it is differentiable with respect to the input and has autograd support! Hooray!
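To make the autograd claim concrete, here is a minimal sketch that backpropagates a scalar loss through the rotation and inspects the gradient on the input. The torch-only rot_img variant, the 16 x 16 random input, and the sum() loss are illustrative assumptions, not the answer's exact code.

```python
import math
import torch
import torch.nn.functional as F

def rot_img(x, theta):
    """Rotate a (B, C, H, W) tensor by theta radians about its center."""
    cos, sin = math.cos(theta), math.sin(theta)
    rot_mat = torch.tensor([[cos, -sin, 0.0],
                            [sin,  cos, 0.0]], dtype=x.dtype, device=x.device)
    rot_mat = rot_mat[None].repeat(x.shape[0], 1, 1)
    grid = F.affine_grid(rot_mat, list(x.shape), align_corners=False)
    return F.grid_sample(x, grid, align_corners=False)

torch.manual_seed(0)
# requires_grad=True marks the image as the quantity being optimized,
# as in DeepDream-style input optimization
x = torch.rand(1, 3, 16, 16, requires_grad=True)

loss = rot_img(x, math.pi / 4).sum()  # placeholder loss for illustration
loss.backward()

grad = x.grad
print(grad.shape, bool(grad.abs().sum() > 0))
```

The gradient has the same shape as the input, so it can be fed directly into an optimizer step on the image.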

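Thanks, your code works really well! But how should I handle multiple images stacked across the batch dimension? Should I apply the operation to each image separately with the same rotation matrix, or can it be modified slightly to work without a for statement? (2 upvotes)
Glad to help :) I've just updated my code so that it also works across the batch dimension. All you need to do is use .repeat(x.shape[0],1,1) in rot_img to repeat the rotation matrix so that it has the same batch dimension as x. (2 upvotes)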
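Building on the batching comment above, one possible extension is to give every image in the batch its own random angle, which is closer to what RandomRotation does. The sketch below is an assumption-laden illustration: rot_img_batch, random_rotation, and max_degrees are hypothetical names that do not appear in the answer, and the angles are drawn uniformly with torch.rand.

```python
import math
import torch
import torch.nn.functional as F

def rot_img_batch(x, thetas):
    """Rotate each image in x (B, C, H, W) by its own angle in thetas (B,), in radians."""
    cos, sin = torch.cos(thetas), torch.sin(thetas)
    zeros = torch.zeros_like(thetas)
    # Stack into a (B, 2, 3) batch of affine matrices without a Python loop
    rot_mats = torch.stack([
        torch.stack([cos, -sin, zeros], dim=1),
        torch.stack([sin, cos, zeros], dim=1),
    ], dim=1)
    grid = F.affine_grid(rot_mats, list(x.shape), align_corners=False)
    return F.grid_sample(x, grid, align_corners=False)

def random_rotation(x, max_degrees=5.0):
    """Hypothetical helper: one uniform random angle in [-max_degrees, max_degrees) per image."""
    max_rad = math.radians(max_degrees)
    thetas = (torch.rand(x.shape[0], device=x.device, dtype=x.dtype) * 2 - 1) * max_rad
    return rot_img_batch(x, thetas)

torch.manual_seed(0)
x = torch.rand(4, 3, 32, 32, requires_grad=True)
out = random_rotation(x, max_degrees=5.0)
out.mean().backward()
print(out.shape, x.grad.shape)

# Zero angles should leave the batch (approximately) unchanged
identity = rot_img_batch(x.detach(), torch.zeros(4))
```

Because the matrices are assembled with torch.stack rather than torch.tensor, gradients could also flow back to thetas if the angles themselves required grad.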
