What is the criterion for choosing nb_epoch, samples_per_epoch and nb_val_samples in Keras fit_generator?

Aze*_*lah 2 python conv-neural-network keras tensorflow

I built a simple cat-vs-dog image classifier (a convolutional neural network), with 7,000 training images per class and 5,500 validation images per class.

My problem is that my system does not finish all the epochs. I would be very grateful if someone could explain the ratio or criterion for choosing the values of nb_epoch, samples_per_epoch and nb_val_samples so as to make the most of the training and validation data.

Here is my code:

from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras.callbacks import EarlyStopping
import numpy as np
from keras.preprocessing import image
from keras.utils.np_utils import probas_to_classes

model=Sequential()
model.add(Convolution2D(32, 5,5, input_shape=(28,28,3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Convolution2D(32,3,3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))

model.add(Dense(2))
model.add(Activation('softmax'))

train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    r'F:\data\train',
    target_size=(28, 28),
    classes=['dog', 'cat'],
    batch_size=10,
    class_mode='categorical',
    shuffle=True)

validation_generator = test_datagen.flow_from_directory(
    r'F:\data\validation',
    target_size=(28, 28),
    classes=['dog', 'cat'],
    batch_size=10,
    class_mode='categorical',
    shuffle=True)

model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
early_stopping=EarlyStopping(monitor='val_loss', patience=2)
model.fit_generator(train_generator,
                    samples_per_epoch=650,
                    nb_epoch=100,
                    validation_data=validation_generator,
                    nb_val_samples=550,
                    callbacks=[early_stopping],
                    verbose=2)

json_string=model.to_json()
open(r'F:\data\mnistcnn_arc.json','w').write(json_string)
model.save_weights(r'F:\data\mnistcnn_weights.h5')
score=model.evaluate_generator(validation_generator, 1000)

print('Test score:', score[0])
print('Test accuracy:', score[1])

img_path = 'F:/abc.jpg'
img = image.load_img(img_path, target_size=(28, 28))
x = image.img_to_array(img)
x /= 255.  # match the rescale=1./255 applied by the generators during training
x = np.expand_dims(x, axis=0)

y_proba = model.predict(x)
y_classes = probas_to_classes(y_proba)
print(train_generator.class_indices)
print(y_classes)

Dav*_*sia 7

samples_per_epoch is usually set to:

samples_per_epoch=train_generator.nb_samples

This way you ensure that each epoch sees a number of samples equal to the size of the training set, which means every epoch covers all of your training samples.


nb_epoch is up to you. It determines how many times you iterate over the number of samples defined by samples_per_epoch.

As an example, with your code as it stands your model "sees" (nb_epoch * samples_per_epoch) images during training, in this case 65,000 images.
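As a sanity check, that total can be worked out directly from the values in the question's fit_generator call:

```python
# Values taken from the fit_generator call in the question.
samples_per_epoch = 650
nb_epoch = 100

# Total number of (augmented) images the model sees over the whole run.
total_images_seen = samples_per_epoch * nb_epoch
print(total_images_seen)  # 65000
```

Note that with samples_per_epoch=650 and 14,000 training images available, each epoch only covers a small random slice of the training set, which is why the rule above recommends using the full training-set size instead.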


nb_val_samples determines how many validation samples the model is evaluated on after finishing each epoch. This is up to you as well. The usual thing is to set:

nb_val_samples=validation_generator.nb_samples

so that your model is evaluated on the full validation set.
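Putting both rules together for the dataset sizes in the question (7,000 training and 5,500 validation images per class, two classes), a sketch of the resulting values. For reference, in Keras 2 these arguments were renamed to steps_per_epoch, epochs and validation_steps, and count batches rather than individual samples:

```python
# Dataset sizes from the question: two classes (dog, cat).
n_train = 7000 * 2   # 14000 training images
n_val = 5500 * 2     # 11000 validation images
batch_size = 10      # from the question's generators

# Keras 1 style: these arguments count individual samples.
samples_per_epoch = n_train   # 14000
nb_val_samples = n_val        # 11000

# Keras 2 style: the renamed arguments count batches instead.
steps_per_epoch = n_train // batch_size     # 1400
validation_steps = n_val // batch_size      # 1100
print(samples_per_epoch, nb_val_samples, steps_per_epoch, validation_steps)
```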


batch_size determines how many images are fed to your GPU (or CPU) at once. A rule of thumb is to set batch_size to the largest value your GPU's memory allows. The ideal batch_size is an active area of research, but a larger batch_size usually works well.
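To see how batch_size interacts with the epoch settings above: the number of weight updates per epoch is the samples per epoch divided by the batch size, so a larger batch_size means fewer (but less noisy) gradient updates. A small sketch using the question's training-set size:

```python
n_train = 14000  # training samples per epoch (question's dataset size)

# Gradient updates performed in one epoch for a few candidate batch sizes.
for batch_size in (10, 32, 64, 128):
    updates = n_train // batch_size
    print(batch_size, updates)
```

With batch_size=10 as in the question, one full epoch means 1,400 updates; raising it to 128 cuts that to 109 larger updates per epoch.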