bad*_*ddy 3 python model-validation keras tensorflow data-augmentation
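I am implementing a CNN in Keras for image classification. I train the model with the .fit_generator() method until an early-stopping condition on the validation data is met, using the following code: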
history_3conv = cnn3.fit_generator(train_data, steps_per_epoch=train_data.n // 98, callbacks=[es, ckpt_3Conv],
                                   validation_data=valid_data, validation_steps=valid_data.n // 98, epochs=50)
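The last two epochs before stopping were the following:
As shown, the final training accuracy is 0.91. However, when I use the model.evaluate() method to evaluate the training, test, and validation sets, I get the following results:
So my question is: why do I get two different values?
Should I use evaluate_generator()? Or should I fix the seed in flow_from_directory()? To perform the data augmentation I used the following code: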
trdata = ImageDataGenerator(rotation_range=90,horizontal_flip=True)
vldata = ImageDataGenerator()
train_data = trdata.flow(x_train,y_train,batch_size=98)
valid_data = vldata.flow(x_valid,y_valid,batch_size=98)
Also, I know that setting use_multiprocessing=False in fit_generator would slow training down significantly. So what do you think is the best solution?
小智 8
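model.fit() and model.evaluate() are the way to go, because model.fit_generator and model.evaluate_generator are deprecated.
The training and validation data are augmented data produced by the generator, so you will see some variation in accuracy. If you use non-augmented validation or test data both in validation_data of fit_generator and in model.evaluate() or model.evaluate_generator, the accuracy will not change.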
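The effect described above can be illustrated without Keras. The following is a minimal NumPy sketch, not the asker's model: `random_flip` and `toy_accuracy` are hypothetical helpers, the "model" is a fixed rule rather than a trained network, and the data is synthetic. It shows that a metric computed on non-augmented data is identical on every pass, that a metric computed on randomly augmented data depends on which random transforms were drawn, and that fixing the seed makes the augmented result repeatable.

```python
import numpy as np

def random_flip(images, rng):
    """Toy augmentation (assumption): flip each image left-right with probability 0.5."""
    flip = rng.random(len(images)) < 0.5
    out = images.copy()
    out[flip] = out[flip][:, :, ::-1]
    return out

def toy_accuracy(images, labels):
    """Toy fixed 'model' (assumption): predict 1 when the left half is brighter than the right half."""
    half = images.shape[2] // 2
    left = images[:, :, :half].mean(axis=(1, 2))
    right = images[:, :, half:].mean(axis=(1, 2))
    preds = (left > right).astype(int)
    return float((preds == labels).mean())

# Synthetic data: 200 grayscale 8x8 "images", labelled by the same rule the toy model uses.
data_rng = np.random.default_rng(42)
x = data_rng.random((200, 8, 8))
half = x.shape[2] // 2
y = (x[:, :, :half].mean(axis=(1, 2)) > x[:, :, half:].mean(axis=(1, 2))).astype(int)

# Non-augmented data: every pass sees the same images, so the metric is identical.
clean_pass_1 = toy_accuracy(x, y)
clean_pass_2 = toy_accuracy(x, y)

# Augmented data: each pass draws new random flips, so the metric depends on the draw.
aug_rng = np.random.default_rng(0)
aug_pass_1 = toy_accuracy(random_flip(x, aug_rng), y)
aug_pass_2 = toy_accuracy(random_flip(x, aug_rng), y)

# Re-creating the random generator with the same seed reproduces the same augmented pass.
seeded_1 = toy_accuracy(random_flip(x, np.random.default_rng(7)), y)
seeded_2 = toy_accuracy(random_flip(x, np.random.default_rng(7)), y)

print("clean:", clean_pass_1, clean_pass_2)
print("augmented:", aug_pass_1, aug_pass_2)
print("augmented, same seed:", seeded_1, seeded_2)
```

In this toy setup the two clean passes always agree, while the two augmented passes typically differ from the clean result and may differ from each other. Fixing the seed only makes an augmented pass repeatable; it does not make it equal to the non-augmented result.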
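Below is a simple cats-vs-dogs classification program that I ran for one epoch:
The validation data generator is reset with val_data_gen.reset(), although that should not be necessary because no augmentation is applied to it. The accuracy on the validation data is then evaluated with both model.evaluate and model.evaluate_generator. The validation accuracy computed at the end of the epoch matches the accuracy computed with model.evaluate and model.evaluate_generator.
Code: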
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, Flatten, Dropout, MaxPooling2D
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.optimizers import Adam
import os
import numpy as np
import matplotlib.pyplot as plt
_URL = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'
path_to_zip = tf.keras.utils.get_file('cats_and_dogs.zip', origin=_URL, extract=True)
PATH = os.path.join(os.path.dirname(path_to_zip), 'cats_and_dogs_filtered')
train_dir = os.path.join(PATH, 'train')
validation_dir = os.path.join(PATH, 'validation')
train_cats_dir = os.path.join(train_dir, 'cats') # directory with our training cat pictures
train_dogs_dir = os.path.join(train_dir, 'dogs') # directory with our training dog pictures
validation_cats_dir = os.path.join(validation_dir, 'cats') # directory with our validation cat pictures
validation_dogs_dir = os.path.join(validation_dir, 'dogs') # directory with our validation dog pictures
num_cats_tr = len(os.listdir(train_cats_dir))
num_dogs_tr = len(os.listdir(train_dogs_dir))
num_cats_val = len(os.listdir(validation_cats_dir))
num_dogs_val = len(os.listdir(validation_dogs_dir))
total_train = num_cats_tr + num_dogs_tr
total_val = num_cats_val + num_dogs_val
batch_size = 1
epochs = 1
IMG_HEIGHT = 150
IMG_WIDTH = 150
train_image_generator = ImageDataGenerator(rescale=1./255,brightness_range=[0.5,1.5]) # Generator for our training data
validation_image_generator = ImageDataGenerator(rescale=1./255) # Generator for our validation data
train_data_gen = train_image_generator.flow_from_directory(batch_size=batch_size,
                                                           directory=train_dir,
                                                           shuffle=True,
                                                           target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                           class_mode='binary')
val_data_gen = validation_image_generator.flow_from_directory(batch_size=batch_size,
                                                              directory=validation_dir,
                                                              target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                              class_mode='binary')
model = Sequential([
    Conv2D(16, 3, padding='same', activation='relu', input_shape=(IMG_HEIGHT, IMG_WIDTH, 3)),
    MaxPooling2D(),
    Conv2D(32, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(64, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Flatten(),
    Dense(512, activation='relu'),
    Dense(1)
])
optimizer = 'SGD'
model.compile(optimizer=optimizer,
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])
history = model.fit_generator(
    train_data_gen,
    steps_per_epoch=total_train // batch_size,
    epochs=epochs,
    validation_data=val_data_gen,
    validation_steps=total_val // batch_size)
from sklearn.metrics import confusion_matrix
# Reset
val_data_gen.reset()
# Evaluate on Validation data
scores = model.evaluate(val_data_gen)
print("%s%s: %.2f%%" % ("evaluate ",model.metrics_names[1], scores[1]*100))
scores = model.evaluate_generator(val_data_gen)
print("%s%s: %.2f%%" % ("evaluate_generator ",model.metrics_names[1], scores[1]*100))
Output:
Found 2000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.
2000/2000 [==============================] - 74s 37ms/step - loss: 0.6932 - accuracy: 0.5025 - val_loss: 0.6815 - val_accuracy: 0.5000
1000/1000 [==============================] - 11s 11ms/step - loss: 0.6815 - accuracy: 0.5000
evaluate accuracy: 50.00%
evaluate_generator accuracy: 50.00%
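The step counts in the output follow from total // batch_size. With batch_size = 1 the division is exact, so the validation pass during training and the later evaluation both cover all 1000 images. As a side note, when the batch size does not divide the sample count evenly (the question uses 98), floor division leaves the final partial batch out of that pass, whereas evaluating a generator without a steps argument should, as far as I know, run through every batch. This may contribute a small difference of its own. The sketch below only does the arithmetic: `batches_per_pass` is a hypothetical helper, and pairing 1000 samples with a batch size of 98 is an assumption for illustration, since the question does not state its set sizes.

```python
import math

def batches_per_pass(num_samples, batch_size):
    """Return (floor steps, steps needed to cover all samples, samples left out by floor division)."""
    floor_steps = num_samples // batch_size
    full_steps = math.ceil(num_samples / batch_size)
    left_out = num_samples - floor_steps * batch_size
    return floor_steps, full_steps, left_out

# The answer's validation set: 1000 images, batch size 1, nothing is left out.
exact_case = batches_per_pass(1000, 1)

# Hypothetical: the same 1000 images with the question's batch size of 98.
uneven_case = batches_per_pass(1000, 98)

print(exact_case)   # (1000, 1000, 0)
print(uneven_case)  # (10, 11, 20)
```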