How do I load the MNIST digit and label data in MATLAB?

Question

How do I load the MNIST digit and label data in MATLAB?

SKM*_*SKM 3 matlab image image-processing mnist

I am trying to run the code given at this link:

https://github.com/bd622/DiscretHashing

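Discrete hashing is a dimensionality-reduction method used for approximate nearest neighbor search. I want to run this implementation on the MNIST database available at http://yann.lecun.com/exdb/mnist/. I have already extracted the files from the compressed gz format.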

Problem 1:

Using the solution for reading the MNIST database given in "Reading MNIST Image Database binary file in MATLAB",

I get the following error:

Error using fread
Invalid file identifier.  Use fopen to generate a valid file identifier.

Error in Reading (line 7)
A = fread(fid, 1, 'uint32');


Here is the code:

clear all;
close all;

%//Open file
fid = fopen('t10k-images-idx3-ubyte', 'r');

A = fread(fid, 1, 'uint32');
magicNumber = swapbytes(uint32(A));

%//For each image, store into an individual cell
imageCellArray = cell(1, totalImages);
for k = 1 : totalImages
    %//Read in numRows*numCols pixels at a time
    A = fread(fid, numRows*numCols, 'uint8');
    %//Reshape so that it becomes a matrix
    %//We are actually reading this in column major format
    %//so we need to transpose this at the end
    imageCellArray{k} = reshape(uint8(A), numCols, numRows)';
end

%//Close the file
fclose(fid);


Update: Problem 1 is solved. The revised code is:

clear all;
close all;

%//Open file
fid = fopen('t10k-images.idx3-ubyte', 'r');

A = fread(fid, 1, 'uint32');
magicNumber = swapbytes(uint32(A));

%//Read in total number of images
%//A = fread(fid, 4, 'uint8');
%//totalImages = sum(bitshift(A', [24 16 8 0]));

%//OR
A = fread(fid, 1, 'uint32');
totalImages = swapbytes(uint32(A));

%//Read in number of rows
%//A = fread(fid, 4, 'uint8');
%//numRows = sum(bitshift(A', [24 16 8 0]));

%//OR
A = fread(fid, 1, 'uint32');
numRows = swapbytes(uint32(A));

%//Read in number of columns
%//A = fread(fid, 4, 'uint8');
%//numCols = sum(bitshift(A', [24 16 8 0]));

%// OR
A = fread(fid, 1, 'uint32');
numCols = swapbytes(uint32(A));

for k = 1 : totalImages
    %//Read in numRows*numCols pixels at a time
    A = fread(fid, numRows*numCols, 'uint8');
    %//Reshape so that it becomes a matrix
    %//We are actually reading this in column major format
    %//so we need to transpose this at the end
    imageCellArray{k} = reshape(uint8(A), numCols, numRows)';
end

%//Close the file
fclose(fid);


Problem 2:

I cannot figure out how to use the four MNIST files with the code. The code contains the variables

traindata = double(traindata);
testdata = double(testdata);


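How do I prepare the MNIST database so that I can apply the implementation to it?

Update: I implemented the solution, but I keep getting this error: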

Error using fread
Invalid file identifier.  Use fopen to generate a valid file identifier.

Error in mnist_parse (line 11)
A = fread(fid1, 1, 'uint32');


These are the files:

demo.m % This is the main file that calls the function to read in the MNIST data

clear all
clc
[Trainimages, Trainlabels] = mnist_parse('C:\Users\Desktop\MNIST\train-images-idx3-ubyte', 'C:\Users\Desktop\MNIST\train-labels-idx1-ubyte');

[Testimages, Testlabels] = mnist_parse('t10k-images-idx3-ubyte', 't10k-labels-idx1-ubyte');

k=5;
digit = images(:,:,k);
lbl = label(k);


 function [images, labels] = mnist_parse(path_to_digits, path_to_labels)

% Open files
fid1 = fopen(path_to_digits, 'r');

% The labels file
fid2 = fopen(path_to_labels, 'r');

% Read in magic numbers for both files
A = fread(fid1, 1, 'uint32');
magicNumber1 = swapbytes(uint32(A)); % Should be 2051
fprintf('Magic Number - Images: %d\n', magicNumber1);

A = fread(fid2, 1, 'uint32');
magicNumber2 = swapbytes(uint32(A)); % Should be 2049
fprintf('Magic Number - Labels: %d\n', magicNumber2);

% Read in total number of images
% Ensure that this number matches with the labels file
A = fread(fid1, 1, 'uint32');
totalImages = swapbytes(uint32(A));
A = fread(fid2, 1, 'uint32');
if totalImages ~= swapbytes(uint32(A))
    error('Total number of images read from images and labels files are not the same');
end
fprintf('Total number of images: %d\n', totalImages);

% Read in number of rows
A = fread(fid1, 1, 'uint32');
numRows = swapbytes(uint32(A));

% Read in number of columns
A = fread(fid1, 1, 'uint32');
numCols = swapbytes(uint32(A));

fprintf('Dimensions of each digit: %d x %d\n', numRows, numCols);

% For each image, store into an individual slice
images = zeros(numRows, numCols, totalImages, 'uint8');
for k = 1 : totalImages
    % Read in numRows*numCols pixels at a time
    A = fread(fid1, numRows*numCols, 'uint8');

    % Reshape so that it becomes a matrix
    % We are actually reading this in column major format
    % so we need to transpose this at the end
    images(:,:,k) = reshape(uint8(A), numCols, numRows).';
end

% Read in the labels
labels = fread(fid2, totalImages, 'uint8');

% Close the files
fclose(fid1);
fclose(fid2);

end


Answer 1

ray*_*ica 5

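I am the original author of approach #1 that you mentioned. Reading in the training data and the test labels is quite simple. As for the images, the code you showed above reads the pixel data correctly and stores it in a cell array. However, you are missing the reads of the number of images, rows and columns stored in the file. Note that the MNIST format for this file is laid out as follows. The left column is the byte offset relative to the beginning of the file: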

[offset] [type]          [value]          [description]
0000     32 bit integer  0x00000803(2051) magic number
0004     32 bit integer  60000            number of images
0008     32 bit integer  28               number of rows
0012     32 bit integer  28               number of columns
0016     unsigned byte   ??               pixel
0017     unsigned byte   ??               pixel
........
xxxx     unsigned byte   ??               pixel


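The first four bytes are a magic number, 2051, which lets you confirm that you are reading the file correctly. The next four bytes give the total number of images, the four bytes after that the number of rows, and the final four header bytes the number of columns. There should be 60000 images of size 28 rows by 28 columns. After the header, the pixels are stored in row-major order, so you have to loop over successive runs of 28 x 28 pixels and store them. In this case I stored them in a cell array, where each element of the cell array is one digit. The same format applies to the test data, except that there are 10000 images.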
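As a quick sanity check on the layout above, the expected file size follows directly from the header: 16 header bytes plus one byte per pixel. The following is a minimal sketch, assuming the extracted file sits in the current folder under the name used in the question; adjust the name if yours differs.

```matlab
% Sketch: expected size of an IDX3 image file, derived from its header fields.
headerBytes = 4 * 4;                 % magic number, image count, rows, columns
numImages   = 60000;                 % training set; use 10000 for the test set
numRows     = 28;
numCols     = 28;
expectedBytes = headerBytes + numImages * numRows * numCols;
fprintf('Expected size: %d bytes\n', expectedBytes);

% Assumption: the file name below matches your extracted file.
info = dir('train-images-idx3-ubyte');
if isempty(info)
    fprintf('File not found in %s\n', pwd);
elseif info(1).bytes ~= expectedBytes
    fprintf('Size mismatch: %d bytes on disk\n', info(1).bytes);
end
```

A size mismatch usually points to a file that is still compressed or was only partially extracted.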

As for the actual labels, the format is much the same, with a few slight differences:

[offset] [type]          [value]          [description]
0000     32 bit integer  0x00000801(2049) magic number (MSB first)
0004     32 bit integer  60000            number of items
0008     unsigned byte   ??               label
0009     unsigned byte   ??               label
........
xxxx     unsigned byte   ??               label


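The first four bytes are a magic number, 2049. The second group of four bytes tells you how many labels there are, and after that there is exactly 1 byte for each corresponding digit in the dataset. The test data uses the same format, but with 10000 labels. So once you have read the header fields of the label file, a single fread call, with the data type set to unsigned 8-bit integer, reads in all the remaining labels.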
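To make the label layout concrete, here is a small sketch that writes a three-label file in the same layout to a scratch location and reads the labels back with one fread call. The scratch file and its contents are made up for illustration; only the byte layout follows the table above.

```matlab
% Sketch: build a tiny file in the IDX1 label layout, then read the labels.
tmpFile = tempname;                       % hypothetical scratch file
fid = fopen(tmpFile, 'w');
fwrite(fid, [0 0 8 1], 'uint8');          % magic number 2049 = 0x00000801, MSB first
fwrite(fid, [0 0 0 3], 'uint8');          % number of items = 3, MSB first
fwrite(fid, [7 2 1], 'uint8');            % one byte per label
fclose(fid);

fid = fopen(tmpFile, 'r');
fseek(fid, 8, 'bof');                     % skip the two 4-byte header fields
labels = fread(fid, inf, 'uint8=>uint8'); % every remaining byte is a label
fclose(fid);
delete(tmpFile);

disp(labels.');
```

Because each label is a single byte, no byte swapping is involved in this read.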

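The reason you have to use swapbytes is that the integers in the file are stored most significant byte first, whereas MATLAB by default reads data in the machine's native byte order. On typical hardware that order is little-endian, meaning the least significant byte of a group of bytes is read first. Calling swapbytes after the read puts the bytes back in the intended order.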
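As an alternative to swapbytes, fopen accepts a machine-format argument, so the byte order can be declared once when the file is opened. The sketch below is not what the function in this answer does; it uses a made-up scratch file holding only the 4-byte magic number and shows both routes giving the same value.

```matlab
% Sketch: let fopen handle the byte order instead of calling swapbytes.
tmpFile = tempname;                       % hypothetical scratch file
fid = fopen(tmpFile, 'w');
fwrite(fid, [0 0 8 3], 'uint8');          % 2051 = 0x00000803 stored MSB first
fclose(fid);

fid = fopen(tmpFile, 'r', 'ieee-be');     % declare big-endian byte ordering
magicBE = fread(fid, 1, 'uint32');        % decoded correctly on any host
fclose(fid);

fid = fopen(tmpFile, 'r', 'ieee-le');     % force little-endian to show the effect
raw = fread(fid, 1, 'uint32=>uint32');    % bytes interpreted in reverse order
fclose(fid);
magicSwapped = swapbytes(raw);            % swapping recovers the stored value
delete(tmpFile);

fprintf('big-endian read: %d, raw little-endian read: %d, after swapbytes: %d\n', ...
        magicBE, raw, magicSwapped);
```

Opening with 'ieee-be' also keeps the code correct on a big-endian host, where an unconditional swapbytes would reverse bytes that were already in the right order.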

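Therefore, I have modified this code for you so that it is an actual function taking two strings: the full path to the digit image file and the full path to the digit labels file. I also changed the code so that the images are stored in a 3D numeric matrix rather than a cell array, for faster processing. Note that once you start reading the actual image data, each pixel is an unsigned 8-bit integer, so no byte swapping is needed. Swapping is only required for values that span multiple bytes: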

function [images, labels] = mnist_parse(path_to_digits, path_to_labels)

% Open files
fid1 = fopen(path_to_digits, 'r');

% The labels file
fid2 = fopen(path_to_labels, 'r');

% Read in magic numbers for both files
A = fread(fid1, 1, 'uint32');
magicNumber1 = swapbytes(uint32(A)); % Should be 2051
fprintf('Magic Number - Images: %d\n', magicNumber1);

A = fread(fid2, 1, 'uint32');
magicNumber2 = swapbytes(uint32(A)); % Should be 2049
fprintf('Magic Number - Labels: %d\n', magicNumber2);

% Read in total number of images
% Ensure that this number matches with the labels file
A = fread(fid1, 1, 'uint32');
totalImages = swapbytes(uint32(A));
A = fread(fid2, 1, 'uint32');
if totalImages ~= swapbytes(uint32(A))
    error('Total number of images read from images and labels files are not the same');
end
fprintf('Total number of images: %d\n', totalImages);

% Read in number of rows
A = fread(fid1, 1, 'uint32');
numRows = swapbytes(uint32(A));

% Read in number of columns
A = fread(fid1, 1, 'uint32');
numCols = swapbytes(uint32(A));

fprintf('Dimensions of each digit: %d x %d\n', numRows, numCols);

% For each image, store into an individual slice
images = zeros(numRows, numCols, totalImages, 'uint8');
for k = 1 : totalImages
    % Read in numRows*numCols pixels at a time
    A = fread(fid1, numRows*numCols, 'uint8');

    % Reshape so that it becomes a matrix
    % The file stores the pixels row by row, but reshape fills column by
    % column, so build a numCols x numRows matrix and transpose it at the end
    images(:,:,k) = reshape(uint8(A), numCols, numRows).';
end

% Read in the labels
labels = fread(fid2, totalImages, 'uint8');

% Close the files
fclose(fid1);
fclose(fid2);

end
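The per-image loop can also be replaced by a single read followed by a reshape and a permute. This is a sketch of that variant rather than the code used above; it runs on a made-up file with two 2 x 3 "images" and a zeroed 16-byte header so the result is easy to check by eye.

```matlab
% Sketch: read every pixel in one fread call, then reshape and permute.
numRows = 2; numCols = 3; totalImages = 2;     % toy sizes for illustration
tmpFile = tempname;                            % hypothetical scratch file
fid = fopen(tmpFile, 'w');
fwrite(fid, zeros(1, 16), 'uint8');            % placeholder 16-byte header
fwrite(fid, 1:12, 'uint8');                    % pixels, stored row by row
fclose(fid);

fid = fopen(tmpFile, 'r');
fseek(fid, 16, 'bof');                         % skip the header
pixels = fread(fid, numRows*numCols*totalImages, 'uint8=>uint8');
fclose(fid);
delete(tmpFile);

% The file stores each image row by row, while MATLAB fills arrays column by
% column, so reshape with columns first and then swap the first two dimensions.
images = permute(reshape(pixels, numCols, numRows, totalImages), [2 1 3]);
disp(images(:,:,1));
disp(images(:,:,2));
```

The permute plays the same role as the transpose inside the loop, applied to all slices at once.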


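To call this function, simply specify the paths to the image file and the labels file. Assuming you run it from the same directory as the files, you would do the following for the training images: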

[images, labels] = mnist_parse('train-images-idx3-ubyte', 'train-labels-idx1-ubyte');


Likewise, you would do the following for the test images:

[images, labels] = mnist_parse('t10k-images-idx3-ubyte', 't10k-labels-idx1-ubyte');
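If fread reports "Invalid file identifier", fopen returned -1, which typically means the path or file name does not match what is on disk. The following sketch checks for that before parsing; the file name is only an example and the listing step is a suggestion, not part of mnist_parse.

```matlab
% Sketch: fail early with a readable message when a file cannot be opened.
path_to_digits = 't10k-images-idx3-ubyte';   % adjust to your own location
[fid, msg] = fopen(path_to_digits, 'r');
if fid == -1
    fprintf('Could not open "%s": %s\n', path_to_digits, msg);
    fprintf('Current folder: %s\n', pwd);
    % Extracted files are sometimes named with a dot instead of a hyphen,
    % e.g. t10k-images.idx3-ubyte, so list what is actually there.
    listing = dir('*ubyte*');
    disp({listing.name});
else
    fclose(fid);
end
```

In the question, the first error went away once the name was changed from t10k-images-idx3-ubyte to t10k-images.idx3-ubyte, so comparing the listing with the name passed to fopen is a reasonable first step.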


To access the kth digit, you simply do:

digit = images(:,:,k);


The label corresponding to the kth digit is:

lbl = labels(k);


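Finally, to get this data into a format acceptable to the code I saw on Github: it assumes that rows correspond to training examples and columns correspond to features. If you want to use this format, simply reshape the data so that the pixels of each image are spread across the columns.

So just do this: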

[trainingdata, traingnd] = mnist_parse('train-images-idx3-ubyte', 'train-labels-idx1-ubyte');
trainingdata = double(reshape(trainingdata, size(trainingdata,1)*size(trainingdata,2), []).');
traingnd = double(traingnd);

[testdata, testgnd] = mnist_parse('t10k-images-idx3-ubyte', 't10k-labels-idx1-ubyte');
testdata = double(reshape(testdata, size(testdata,1)*size(testdata,2), []).');
testgnd = double(testgnd);


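The above is meant to use the same variable names as the script, so you should be able to plug it in. Note that the snippet in the question calls the training matrix traindata, so rename trainingdata if the script expects that name. The second line reshapes the matrix so that each digit is in a column, and the transpose then puts each digit in a row. We also cast to double, since that is what the Github code does. The same logic is applied to the test data. Also note that I have explicitly cast the training and test labels to double to ensure maximum compatibility with whatever algorithm you decide to use on this data.

Happy digit hacking!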
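The effect of the reshape and transpose is easier to see on a toy array. The sketch below uses three made-up 2 x 2 "digits" instead of MNIST data and checks that row k of the result holds the pixels of digit k.

```matlab
% Sketch: flatten a rows x cols x N image stack into an N x (rows*cols) matrix.
images = zeros(2, 2, 3, 'uint8');      % three toy 2x2 "digits"
images(:,:,1) = [1 2; 3 4];
images(:,:,2) = [5 6; 7 8];
images(:,:,3) = [9 10; 11 12];

X = double(reshape(images, size(images,1)*size(images,2), []).');
disp(size(X));                         % one row per example, one column per pixel

k = 2;
digitK = images(:,:,k);
% Row k of X is digit k unrolled column by column.
assert(isequal(X(k,:), double(digitK(:)).'));
disp(X(k,:));
```

The pixels within each row end up in column-major order, which is fine as long as the training and test data are flattened the same way.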

Archived: 9 years, 4 months ago
Views: 15119
Last activity: 9 years, 4 months ago