How do I convert a Keras ImageDataGenerator to a NumPy array?

Nav*_*vin 2 numpy deep-learning conv-neural-network keras tensorflow

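I'm working on a CNN model and I'm curious how to convert the output of datagen.flow_from_directory() into a numpy array. What datagen.flow_from_directory() returns is a DirectoryIterator.

Apart from ImageDataGenerator, is there any other way to load data from a directory?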

from tensorflow.keras.preprocessing.image import ImageDataGenerator

img_width = 150
img_height = 150

datagen = ImageDataGenerator(rescale=1/255.0, validation_split=0.2)

train_data_gen = datagen.flow_from_directory(directory='/content/xray_dataset_covid19',
                                             target_size = (img_width, img_height),
                                             class_mode='binary',
                                             batch_size=16,
                                             subset='training')

vali_data_gen = datagen.flow_from_directory(directory='/content/xray_dataset_covid19',
                                             target_size = (img_width, img_height),
                                             class_mode='binary',
                                             batch_size=16,
                                             subset='validation')

bsq*_*are 5

First approach:

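import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

data_gen = ImageDataGenerator(rescale=1. / 255)

# data_dir, img_height, img_width and batch_size are assumed to be defined already
data_generator = data_gen.flow_from_directory(
    data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical')
data_list = []

# len(data_generator) is the number of batches in one pass over the data
for _ in range(len(data_generator)):
    data = next(data_generator)  # an (images, labels) tuple
    data_list.append(data[0])

# now, data_array is the numeric data of all images; concatenate along the
# batch axis because the last batch may be smaller than batch_size
data_array = np.concatenate(data_list)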

Alternatively, you can process the images yourself with PIL and numpy:

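from PIL import Image
import numpy as np

def image_to_array(file_path):
    # img_width and img_height are the target dimensions defined in the question
    img = Image.open(file_path)
    img = img.resize((img_width, img_height))  # PIL expects (width, height)
    data = np.asarray(img, dtype='float32')
    # data has shape (height, width, channels) for a single image
    # (grayscale images have no channel axis)
    return data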
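To turn a whole directory into arrays this way, the per-image conversion can be applied to every file and the results stacked. The following is only a sketch under some assumptions: load_directory is a hypothetical helper name, the directory is assumed to contain one subfolder per class (the layout flow_from_directory expects), every file in those subfolders is assumed to be a readable image, and images are converted to RGB so that all arrays share one shape.

```python
import os

import numpy as np
from PIL import Image


def load_directory(data_dir, img_width=150, img_height=150):
    """Load every image under data_dir/<class_name>/ into NumPy arrays.

    Hypothetical helper: assumes one subfolder per class and that each
    file inside those subfolders is an image PIL can open.
    """
    class_names = sorted(
        d for d in os.listdir(data_dir)
        if os.path.isdir(os.path.join(data_dir, d))
    )
    images, labels = [], []
    for label, class_name in enumerate(class_names):
        class_dir = os.path.join(data_dir, class_name)
        for file_name in sorted(os.listdir(class_dir)):
            file_path = os.path.join(class_dir, file_name)
            with Image.open(file_path) as img:
                # Force 3 channels so every array has the same shape
                img = img.convert("RGB").resize((img_width, img_height))
                # Scale to [0, 1], matching rescale=1/255 in the question
                images.append(np.asarray(img, dtype="float32") / 255.0)
            labels.append(label)
    x = np.stack(images)    # shape: (num_images, img_height, img_width, 3)
    y = np.asarray(labels)  # integer class index for each image
    return x, y, class_names
```

Calling x, y, class_names = load_directory('/content/xray_dataset_covid19') would then give the image array, the labels, and the class names in index order. Everything is held in memory at once, so this suits small datasets.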

Second approach: use ImageDataGenerator.flow, which accepts numpy arrays directly. It replaces the flow_from_directory call, and all other code that uses the generator should stay the same.
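A minimal sketch of what that swap might look like, assuming a TensorFlow installation that still ships tf.keras.preprocessing.image (together with SciPy, which some versions need for image transforms). The arrays x and y here are random stand-ins rather than real data, and the 32x32 image size is chosen only to keep the example small.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Stand-in data (an assumption for illustration): 20 random RGB "images"
# with pixel values in [0, 255] and alternating binary labels
rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=(20, 32, 32, 3)).astype("float32")
y = np.arange(20) % 2

datagen = ImageDataGenerator(rescale=1 / 255.0, validation_split=0.2)

# flow() takes the arrays directly; subset works as in flow_from_directory
train_data_gen = datagen.flow(x, y, batch_size=16, subset="training")
vali_data_gen = datagen.flow(x, y, batch_size=16, subset="validation")

# Each batch is an (images, labels) tuple of NumPy arrays
x_batch, y_batch = next(train_data_gen)
```

With validation_split set, flow() takes the validation subset from the start of the arrays and the training subset from the rest, so both parts need to contain every class; shuffling x and y together beforehand is one way to arrange that.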