Using multiple validation sets with Keras

Ero*_*ror 12 validation monitoring keras

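I am training a model in Keras using the model.fit() method. I would like to use multiple validation sets, each validated separately after every training epoch, so that I get one loss value per validation set. If possible, the values should be displayed during training and also be returned by the keras.callbacks.History() callback.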

I was thinking of something like this:

history = model.fit(train_data, train_targets,
                    epochs=epochs,
                    batch_size=batch_size,
                    validation_data=[
                        (validation_data1, validation_targets1), 
                        (validation_data2, validation_targets2)],
                    shuffle=True)

I currently have no idea how to implement this. Is it possible to achieve this by writing my own Callback? Or how else would you approach this problem?

Ero*_*ror 18

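I ended up writing my own Callback based on the History callback. I am not sure whether this is the best approach, but the following Callback records the losses and metrics for the training and validation set, as the History callback does, as well as the losses and metrics for the additional validation sets passed to the constructor.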

from keras.callbacks import Callback


class AdditionalValidationSets(Callback):
    def __init__(self, validation_sets, verbose=0, batch_size=None):
        """
        :param validation_sets:
        a list of 3-tuples (validation_data, validation_targets, validation_set_name)
        or 4-tuples (validation_data, validation_targets, sample_weights, validation_set_name)
        :param verbose:
        verbosity mode, 1 or 0
        :param batch_size:
        batch size to be used when evaluating on the additional datasets
        """
        super(AdditionalValidationSets, self).__init__()
        self.validation_sets = validation_sets
        for validation_set in self.validation_sets:
            # each entry must be a 3-tuple or a 4-tuple, as documented above
            if len(validation_set) not in [3, 4]:
                raise ValueError()
        self.epoch = []
        self.history = {}
        self.verbose = verbose
        self.batch_size = batch_size

    def on_train_begin(self, logs=None):
        self.epoch = []
        self.history = {}

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        self.epoch.append(epoch)

        # record the same values as History() as well
        for k, v in logs.items():
            self.history.setdefault(k, []).append(v)

        # evaluate on the additional validation sets
        for validation_set in self.validation_sets:
            if len(validation_set) == 3:
                validation_data, validation_targets, validation_set_name = validation_set
                sample_weights = None
            elif len(validation_set) == 4:
                validation_data, validation_targets, sample_weights, validation_set_name = validation_set
            else:
                raise ValueError()

            results = self.model.evaluate(x=validation_data,
                                          y=validation_targets,
                                          verbose=self.verbose,
                                          sample_weight=sample_weights,
                                          batch_size=self.batch_size)

            for i, result in enumerate(results):
                if i == 0:
                    valuename = validation_set_name + '_loss'
                else:
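                    metric = self.model.metrics[i-1]
                    # depending on the Keras version, the entry is either a
                    # plain string or a function with a __name__ attribute
                    metric_name = metric if isinstance(metric, str) else metric.__name__
                    valuename = validation_set_name + '_' + metric_name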
                self.history.setdefault(valuename, []).append(result)

I then use it like this:

history = AdditionalValidationSets([(validation_data2, validation_targets2, 'val2')])
model.fit(train_data, train_targets,
          epochs=epochs,
          batch_size=batch_size,
          validation_data=(validation_data1, validation_targets1),
          callbacks=[history],
          shuffle=True)
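
After training, the values recorded for the extra sets sit in the callback's history dict, next to the usual keys. The following is a minimal sketch of reading them back. It assumes the dict maps each key to a list with one value per epoch, as the callback above builds it; the helper name best_epoch and the numbers are placeholders for illustration, not real training output.

```python
def best_epoch(history, key):
    """Return (epoch_index, value) for the lowest recorded value of `key`."""
    values = history[key]
    idx = min(range(len(values)), key=values.__getitem__)
    return idx, values[idx]


# Placeholder numbers for illustration only, not real training output.
# With the callback above, this dict would be `history.history`.
example_history = {
    'val_loss': [0.9, 0.7, 0.8],
    'val2_loss': [1.2, 1.0, 0.6],
}

for key in sorted(example_history):
    epoch, value = best_epoch(example_history, key)
    print(key, 'is lowest at epoch', epoch, 'with value', value)
```

Keys for the additional sets follow the name given in the tuple, so a set named 'val2' shows up as 'val2_loss'.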

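  • In the latest version of Keras, `self.model.metrics[i-1]` is a string. `__name__` is no longer needed (and raises an error). (3 upvotes)
  • Would you consider turning this into a Python package? Or, if you don't have the time, I could do it. (2 upvotes)
  • @LucaCappelletti The reason I haven't made a package is that I modify the callback slightly in almost every project (for example, to support data generators). Copying and pasting the code from one project to another is not the best workflow, but it is fairly easy, and as you guessed, I don't have time to make a package out of it (unless you want to wait a month or so). If you have time, feel free to make a package from it. I added my GitHub username to my account so you can add me to the project if you like. (2 upvotes)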
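
Related to the first comment: one way to sidestep the version differences is to build the result keys from names the caller supplies, rather than reading them off the model. The sketch below is only an illustration; the helper name_results and its metric_names argument are hypothetical and not part of Keras. It also handles the case where evaluate() returns a bare scalar, which happens when the model was compiled without metrics.

```python
def name_results(set_name, results, metric_names):
    """Pair evaluate() output with keys such as 'val2_loss' and 'val2_accuracy'.

    metric_names is supplied by the caller (an assumption of this sketch),
    e.g. the metric names passed to compile(), in the same order.
    """
    # evaluate() returns a single scalar when the model has no metrics
    if not isinstance(results, (list, tuple)):
        results = [results]
    names = ['loss'] + list(metric_names)
    if len(names) != len(results):
        raise ValueError('expected %d values, got %d' % (len(names), len(results)))
    return {set_name + '_' + name: value for name, value in zip(names, results)}


print(name_results('val2', 0.5, []))
print(name_results('val2', [0.5, 0.9], ['accuracy']))
```

Inside on_epoch_end, the returned dict could be appended to the history entry by entry in place of the enumerate loop.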

Nim*_*hli 5

I tested this on TensorFlow 2 and it works. You can evaluate on as many validation sets as you like at the end of each epoch:

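import tensorflow as tf


# X_test_1, y_test_1, X_test_2 and y_test_2 are assumed to be defined elsewhere
class MyCustomCallback(tf.keras.callbacks.Callback):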
    def on_epoch_end(self, epoch, logs=None):
        res_eval_1 = self.model.evaluate(X_test_1, y_test_1, verbose = 0)
        res_eval_2 = self.model.evaluate(X_test_2, y_test_2, verbose = 0)
        print(res_eval_1)
        print(res_eval_2)

Then:

my_val_callback = MyCustomCallback()
# Your model creation code
model.fit(..., callbacks=[my_val_callback])
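
The callback above reads its data from global variables and only prints the results. A possible variation, sketched here under assumptions, passes the sets in through the constructor and keeps the per-epoch results on the callback. The class name NamedValidationSets and the dict-of-pairs layout are made up for this example, and the import fallback exists only so the sketch can be loaded without TensorFlow installed.

```python
try:
    from tensorflow.keras.callbacks import Callback
except ImportError:
    # Fallback so the sketch can be loaded without TensorFlow installed;
    # for real training the Keras base class is required.
    Callback = object


class NamedValidationSets(Callback):
    """Evaluate named extra validation sets at the end of every epoch."""

    def __init__(self, validation_sets):
        # validation_sets: dict mapping a name to an (inputs, targets) pair
        super().__init__()
        self.validation_sets = validation_sets
        self.results = {}

    def on_epoch_end(self, epoch, logs=None):
        for name, (inputs, targets) in self.validation_sets.items():
            res = self.model.evaluate(inputs, targets, verbose=0)
            # keep one entry per epoch for each named set
            self.results.setdefault(name, []).append(res)
```

It would be passed to fit() the same way as my_val_callback, and results can be inspected once training has finished.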