I downloaded some sample images from the MNIST dataset in .jpg format. I am now loading these images to test my pretrained model.
# transforms to apply to the data
trans = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])
# MNIST dataset
test_dataset = dataset.ImageFolder(root=DATA_PATH, transform=trans)
# Data loader
test_loader = DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=False)
Here DATA_PATH contains subfolders with the sample images.
Here is my network definition:
# Convolutional neural network (two convolutional layers)
class ConvNet(nn.Module):
    def __init__(self):
        super(ConvNet, self).__init__()
        self.network2D = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=5, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(32, 64, kernel_size=5, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2))
        self.network1D = nn.Sequential(
            nn.Dropout(),
            nn.Linear(7 * 7 * 64, 1000),
            nn.Linear(1000, 10))

    def forward(self, x):
        out = self.network2D(x)
        out = out.reshape(out.size(0), -1)
        out = self.network1D(out)
        return out
Here is my inference code:
# Test the model
model = torch.load("mnist_weights_5.pth.tar")
model.eval()
for images, labels in test_loader:
    outputs = model(images.cuda())
When I run this code, I get the following error:
RuntimeError: Given groups=1, weight of size [32, 1, 5, 5], expected input[1, 3, 28, 28] to have 1 channels, but got 3 channels instead
I understand that the images are loaded with 3 channels (RGB). So how do I convert them to a single channel in the dataloader?
Update: I changed transforms to include the Grayscale option:
trans = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,)), transforms.Grayscale(num_output_channels=1)])
But now I get this error:
TypeError: img should be PIL Image. Got <class 'torch.Tensor'>
小智 5
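When you use the ImageFolder class without a custom loader, PyTorch loads the images with PIL and converts them to RGB. This is the default loader when the torchvision image backend is PIL: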
def pil_loader(path):
    with open(path, 'rb') as f:
        img = Image.open(f)
        return img.convert('RGB')
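You can use torchvision's Grayscale transform in your transform pipeline. It converts a 3-channel RGB image to a 1-channel grayscale image. The TypeError in the question arises because Grayscale was placed after ToTensor, so it received a tensor instead of a PIL image; place it before ToTensor. More information is available at https://pytorch.org/docs/stable/torchvision/transforms.html#torchvision.transforms.Grayscale.
Here is some sample code: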
import torchvision as tv
import torch.utils.data as data

dataDir = 'D:\\general\\ML_DL\\datasets\\CIFAR'
# Grayscale runs first, on the PIL image; Normalize then needs one mean/std per channel (a single channel here)
trainTransform = tv.transforms.Compose([tv.transforms.Grayscale(num_output_channels=1),
                                        tv.transforms.ToTensor(),
                                        tv.transforms.Normalize((0.5,), (0.5,))])
trainSet = tv.datasets.CIFAR10(dataDir, train=True, download=False, transform=trainTransform)
dataloader = data.DataLoader(trainSet, batch_size=1, shuffle=False, num_workers=0)
images, labels = next(iter(dataloader))
print(images.size())
Har*_*han -2
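I found a very simple way to solve this problem. The tensor needs to have shape [1,1,28,28], whereas the input tensor has shape [1,3,28,28]. So I just need to read one channel from it: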
images = images[:,0,:,:]
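This gives me a tensor of shape [1,28,28]. Now I need to convert it to a tensor of shape [1,1,28,28], which means inserting a channel dimension at index 1 (unsqueeze(0) produces the same shape only when the batch size is 1). This can be done like so:
images = images.unsqueeze(1)
So, putting the two lines above together, the prediction code can be written like this:
for images, labels in test_loader:
    images = images[:,0,:,:].unsqueeze(1)  ## Extract a single channel and restore the channel dimension
    outputs = model(images.cuda())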
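A quick shape check of the channel-slicing approach on a batch larger than one may help here. This is only a sketch on random data (the batch of 4 is an assumption, not the asker's loader). It compares inserting the channel dimension at index 1, keeping the dimension with a range slice, and inserting at index 0.

```python
import torch

batch = torch.rand(4, 3, 28, 28)  # hypothetical batch of 4 RGB images

sliced = batch[:, 0, :, :]        # integer index drops the channel dim -> [4, 28, 28]
restored = sliced.unsqueeze(1)    # channel dim re-inserted at index 1  -> [4, 1, 28, 28]
kept = batch[:, 0:1, :, :]        # range slice keeps the channel dim   -> [4, 1, 28, 28]
at_front = sliced.unsqueeze(0)    # new dim at index 0                  -> [1, 4, 28, 28]

print(restored.shape, kept.shape, at_front.shape)
```

Only the first two results have the [N, 1, 28, 28] layout the network expects for N > 1. Note also that taking channel 0 is equivalent to a grayscale conversion only if the three channels hold the same values, which is typically the case for grayscale images loaded as RGB.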