I am trying to set up a nonlinear regression problem in Keras. Unfortunately, the results show that overfitting is occurring. Here is the code:
from keras.models import Sequential
from keras.layers import Dense
from keras import optimizers, regularizers

model = Sequential()
model.add(Dense(number_of_neurons, input_dim=X_train.shape[1], activation='relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(outdim, activation='linear'))
Adam = optimizers.Adam(lr=0.001)
model.compile(loss='mean_squared_error', optimizer=Adam, metrics=['mae'])
model.fit(X, Y, epochs=1000, batch_size=500, validation_split=0.2, shuffle=True, verbose=2, initial_epoch=0)
The results without regularization are shown here: no regularization. The training mean absolute error is much smaller than the validation one, and the two maintain a fixed gap, which is a sign of overfitting.
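That gap can be quantified directly from the per-split errors. A minimal sketch (pure NumPy, with made-up targets and predictions standing in for real model output) of the MAE computed on each split and the resulting gap:

```python
import numpy as np

def mae(y_true, y_pred):
    # Mean absolute error, the same metric as Keras' 'mae'
    return np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)))

# Hypothetical targets and predictions for the two splits
y_train = np.array([1.0, 2.0, 3.0, 4.0])
pred_train = np.array([1.0, 2.0, 3.0, 4.0])  # near-perfect fit on training data
y_val = np.array([1.5, 2.5, 3.5, 4.5])
pred_val = np.array([1.0, 2.0, 3.0, 4.0])    # consistently off on validation data

# A persistent positive gap like this is the overfitting signature
gap = mae(y_val, pred_val) - mae(y_train, pred_train)
print(gap)
```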
L2 regularization was then specified for every layer like this:
from keras.models import Sequential
from keras.layers import Dense
from keras import optimizers, regularizers

model = Sequential()
model.add(Dense(number_of_neurons, input_dim=X_train.shape[1], activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(outdim, activation='linear'))
Adam = optimizers.Adam(lr=0.001)
model.compile(loss='mean_squared_error', optimizer=Adam, metrics=['mae'])
model.fit(X, Y, epochs=1000, batch_size=500, validation_split=0.2, shuffle=True, verbose=2, initial_epoch=0)
These results are shown here: L2 regularization results. The validation MAE is now close to the training MAE, which is good. However, the training MAE itself is poor at 0.03 (without regularization it was much lower, at 0.0028). …
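The higher training error is the expected cost of the penalty term: with `kernel_regularizer=regularizers.l2(0.001)` on each Dense layer, Keras minimizes the MSE plus 0.001 times the sum of squared kernel weights. A hand-rolled sketch of that combined objective (NumPy only, toy numbers, not Keras' actual code):

```python
import numpy as np

def l2_penalized_loss(y_true, y_pred, kernels, lam=0.001):
    # Mean squared error, as in model.compile(loss='mean_squared_error')
    mse = np.mean((y_true - y_pred) ** 2)
    # L2 penalty: lam * sum of squared entries over every regularized kernel
    penalty = lam * sum(np.sum(k ** 2) for k in kernels)
    return mse + penalty

y_true = np.array([1.0, 2.0])
y_pred = np.array([1.0, 2.0])                 # zero data error
kernels = [np.ones((2, 2)), np.ones((2, 1))]  # toy layer weights
total = l2_penalized_loss(y_true, y_pred, kernels)
print(total)  # 0.001 * (4 + 2) = 0.006: the penalty alone sets a loss floor
```

Even a network that fits the data perfectly pays this floor, which is why the reported training error rises as the regularization strength grows.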
I am trying to understand the effect of dropout on the validation mean absolute error (in a nonlinear regression problem).
Without dropout
With a dropout of 0.05
Without any dropout, the validation loss is larger than the training loss, as shown in 1. My understanding is that for a good fit the validation loss should be only slightly higher than the training loss.
Carefully, I increased the dropout so that the validation loss approached the training loss, as shown in 2. Dropout is applied only during training and not during validation, so the validation loss drops below the training loss.
Finally, the dropout was increased further, and the validation loss again exceeds the training loss, as shown in 3.
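The asymmetry above (dropout active in training, disabled at evaluation) is what lets the validation loss dip below the training loss. A NumPy sketch of inverted dropout, the variant Keras uses, showing the two modes (this is an illustration, not Keras' actual implementation):

```python
import numpy as np

def dropout(x, rate, training):
    # Inverted dropout: at train time, zero each unit with probability `rate`
    # and scale the survivors by 1/(1 - rate) so the expected value is unchanged;
    # at inference time, pass the input through untouched.
    if not training or rate == 0.0:
        return x
    mask = (np.random.rand(*x.shape) >= rate).astype(x.dtype)
    return x * mask / (1.0 - rate)

np.random.seed(0)
x = np.ones((4, 3))
train_out = dropout(x, rate=0.5, training=True)   # some units zeroed, survivors scaled to 2.0
eval_out = dropout(x, rate=0.5, training=False)   # identical to the input
print(train_out)
print(eval_out)
```

Because the training loss is computed on the noisy, masked activations while the validation loss sees the full network, the training loss carries an extra handicap that the validation loss does not.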
Which of these three cases should be called a good fit?
Following Marcin Możejko's response, I ran predictions for the three cases, as shown in 4. The 'Y' axis shows RMS error instead of MAE. The model without dropout gives the best result.
validation machine-learning neural-network deep-learning keras
I am trying to do a hyperparameter search with GridSearchCV and KerasRegressor. The Keras model.fit function itself lets you inspect the 'loss' and 'val_loss' variables through the History object it returns.
Is it possible to view the 'loss' and 'val_loss' variables when using GridSearchCV?
Here is the code I use to do the grid search:
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import GridSearchCV

model = KerasRegressor(build_fn=create_model_gridsearch, verbose=0)
layers = [[16], [16,8]]
activations = ['relu']
optimizers = ['Adam']
param_grid = dict(layers=layers, activation=activations, input_dim=[X_train.shape[1]], output_dim=[Y_train.shape[1]], batch_size=specified_batch_size, epochs=num_of_epochs, optimizer=optimizers)
grid = GridSearchCV(estimator=model, param_grid=param_grid, scoring='neg_mean_squared_error', n_jobs=-1, verbose=1, cv=7)
grid_result = grid.fit(X_train, Y_train)
# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in sorted(zip(means, stds, params), key=lambda x: -x[0]):
    print("%f (%f) with: %r" % (mean, stdev, param))
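GridSearchCV only keeps aggregated scores in `cv_results_`; it does not retain each fit's History object. One workaround is to enumerate the same parameter combinations yourself and fit once per combination, keeping every History. A sketch using only the standard library (the `create_model_gridsearch` and `model.fit` calls in the comments are the hypothetical manual loop, not executed here):

```python
from itertools import product

def parameter_grid(param_grid):
    # Enumerate every combination in the grid, like sklearn's ParameterGrid
    keys = sorted(param_grid)
    for values in product(*(param_grid[k] for k in keys)):
        yield dict(zip(keys, values))

param_grid = dict(layers=[[16], [16, 8]], activation=['relu'], optimizer=['Adam'])

histories = {}
for params in parameter_grid(param_grid):
    # Hypothetical manual loop replacing GridSearchCV:
    #   model = create_model_gridsearch(**params)
    #   hist = model.fit(X_train, Y_train, validation_split=0.2, ...)
    #   histories[str(params)] = hist.history  # contains 'loss' and 'val_loss'
    print(params)
```

This trades GridSearchCV's cross-validation bookkeeping for full access to the per-epoch 'loss' and 'val_loss' curves of each configuration.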