Keras fit_generator and fit give different results

Sud*_*aja 6 generator python-3.x deep-learning keras tensorflow

I am training a convolutional neural network on a dataset of face images. The dataset has 10,000 images of size 700 x 700, and my model has 12 layers. I use a generator function to read images into the Keras fit_generator function, as shown below.

train_file_names ==> Python list containing the file names of the training instances
train_class_labels ==> Numpy array of one-hot encoded class labels ([0, 1, 0], [0, 0, 1], etc.)
train_data ==> Numpy array of training instances
train_steps_epoch ==> 16 (batch size is 400 and I have 6400 instances for training, so a single pass over the whole dataset takes 16 iterations)
batch_size ==> 400
calls_made ==> When the generator reaches the end of the training instances, it resets the indices so the next epoch starts loading from the first index again.

I pass this generator as an argument to the Keras 'fit_generator' function so that a new batch of data is generated for every epoch.

val_data, val_class_labels ==> Numpy arrays of validation data
epochs ==> number of epochs

Using Keras fit_generator

model.fit_generator(generator=train_generator, steps_per_epoch=train_steps_per_epoch, epochs=epochs, use_multiprocessing=False, validation_data=[val_data, val_class_labels], verbose=True, callbacks=[history, model_checkpoint], shuffle=True, initial_epoch=0) 

Code

def train_data_generator(self):     
    index_start = index_end = 0 
    temp = 0
    calls_made = 0

    while temp < train_steps_per_epoch:
        index_end = index_start + batch_size
        for temp1 in range(index_start, index_end):
            index = 0
            # Read image
            img = cv2.imread(str(TRAIN_DIR / train_file_names[temp1]), cv2.IMREAD_GRAYSCALE).T
            train_data[index]  = cv2.resize(img, (self.ROWS, self.COLS), interpolation=cv2.INTER_CUBIC)
            index += 1       
        yield train_data, self.train_class_labels[index_start:index_end]
        calls_made += 1
        if calls_made == train_steps_per_epoch:
            index_start = 0
            temp = 0
            calls_made = 0
        else:
            index_start = index_end
            temp += 1  
        gc.collect()
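Incidentally, in the loop above `index` is re-initialized to 0 on every iteration of the inner `for` loop, so each resized image overwrites `train_data[0]` and the rest of the batch is never refreshed. A minimal sketch of the batch-filling step with a running index; `load_image` is a hypothetical stand-in for the `cv2.imread` + `cv2.resize` calls:

```python
import numpy as np

def fill_batch(file_names, index_start, index_end, rows, cols, load_image):
    """Build one batch of images into a fresh array.

    load_image is a hypothetical callable mapping a file name to a
    (rows, cols) numpy array (e.g. wrapping cv2.imread + cv2.resize).
    enumerate keeps a running slot index, unlike the original loop,
    which reset `index = 0` inside the for loop body.
    """
    batch = np.empty((index_end - index_start, rows, cols), dtype=np.float32)
    for index, name in enumerate(file_names[index_start:index_end]):
        batch[index] = load_image(name)  # each image gets its own slot
    return batch
```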

Output of fit_generator

Epoch 86/300
16/16 [==============================] - 16s 1s/step - loss: 1.5739 - acc: 0.2991 - val_loss: 12.0076 - val_acc: 0.2110
Epoch 87/300
16/16 [==============================] - 16s 1s/step - loss: 1.6010 - acc: 0.2549 - val_loss: 11.6689 - val_acc: 0.2016
Epoch 88/300
16/16 [==============================] - 16s 1s/step - loss: 1.5750 - acc: 0.2391 - val_loss: 10.2663 - val_acc: 0.2004
Epoch 89/300
16/16 [==============================] - 16s 1s/step - loss: 1.5526 - acc: 0.2641 - val_loss: 11.8809 - val_acc: 0.2249
Epoch 90/300
16/16 [==============================] - 16s 1s/step - loss: 1.5867 - acc: 0.2602 - val_loss: 12.0392 - val_acc: 0.2010
Epoch 91/300
16/16 [==============================] - 16s 1s/step - loss: 1.5524 - acc: 0.2609 - val_loss: 12.0254 - val_acc: 0.2027

My problem is that when I use 'fit_generator' with the generator function above, the model loss does not improve at all and validation accuracy stays very poor. However, when I train with the Keras 'fit' function below, the loss decreases and validation accuracy is far better.

Using the Keras fit function without a generator

model.fit(self.train_data, self.train_class_labels, batch_size=self.batch_size, epochs=self.epochs, validation_data=[self.val_data, self.val_class_labels], verbose=True, callbacks=[history, model_checkpoint])    

Output when training with the fit function

Epoch 25/300
6400/6400 [==============================] - 20s 3ms/step - loss: 0.0207 - acc: 0.9939 - val_loss: 4.1009 - val_acc: 0.4916
Epoch 26/300
6400/6400 [==============================] - 20s 3ms/step - loss: 0.0197 - acc: 0.9948 - val_loss: 2.4758 - val_acc: 0.5568
Epoch 27/300
6400/6400 [==============================] - 20s 3ms/step - loss: 0.0689 - acc: 0.9800 - val_loss: 1.2843 - val_acc: 0.7361
Epoch 28/300
6400/6400 [==============================] - 20s 3ms/step - loss: 0.0207 - acc: 0.9947 - val_loss: 5.6979 - val_acc: 0.4560
Epoch 29/300
6400/6400 [==============================] - 20s 3ms/step - loss: 0.0353 - acc: 0.9908 - val_loss: 1.0801 - val_acc: 0.7817
Epoch 30/300
6400/6400 [==============================] - 20s 3ms/step - loss: 0.0362 - acc: 0.9896 - val_loss: 3.7851 - val_acc: 0.5173
Epoch 31/300
6400/6400 [==============================] - 20s 3ms/step - loss: 0.0481 - acc: 0.9896 - val_loss: 1.1152 - val_acc: 0.7795
Epoch 32/300
6400/6400 [==============================] - 20s 3ms/step - loss: 0.0106 - acc: 0.9969 - val_loss: 1.4803 - val_acc: 0.7372

Cer*_*rno 0

You have to make sure that your data generator shuffles the data between epochs. I would suggest you create a list of possible indices outside of your loop, randomize it with random.shuffle, and then iterate over it inside your loop.
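A minimal sketch of such a generator, reshuffling the sample order at the start of every epoch. The `load_image` callable and `shuffling_generator` name are illustrative, not from the original post; `load_image` stands in for the cv2 read/resize step:

```python
import numpy as np

def shuffling_generator(file_names, class_labels, batch_size, load_image, seed=None):
    """Infinite batch generator that reshuffles the index order every epoch.

    file_names   : sequence of sample identifiers
    class_labels : numpy array of one-hot labels, aligned with file_names
    load_image   : hypothetical callable mapping a file name to a 2-D array
    """
    rng = np.random.default_rng(seed)
    indices = np.arange(len(file_names))
    while True:                      # Keras expects the generator never to end
        rng.shuffle(indices)         # fresh random order for each epoch
        for start in range(0, len(indices) - batch_size + 1, batch_size):
            batch_idx = indices[start:start + batch_size]
            images = np.stack([load_image(file_names[i]) for i in batch_idx])
            yield images, class_labels[batch_idx]  # labels stay aligned
```

Because the labels are gathered with the same shuffled `batch_idx`, image/label pairs stay aligned no matter how the order is permuted.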

Source: https://github.com/keras-team/keras/issues/2389 and my own experience.