Custom loss function involving gradients in Keras/TensorFlow

Question

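I have seen this question asked several times before, but never with a solution. My problem is simple: I want to implement a loss function that computes the MSE between the gradient of the prediction and the ground truth (eventually moving on to more complex loss functions).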
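
In symbols, and assuming a scalar input and output as in the code below, the loss described here could be written roughly as follows. The names $f_\theta$ (the network with weights $\theta$), $N$ (the batch size) and $y_i$ (the value passed as `y_true` for sample $x_i$) are notation introduced only for illustration:

```latex
\mathcal{L}(\theta) = \frac{1}{N} \sum_{i=1}^{N} \left( \frac{\partial f_\theta}{\partial x}(x_i) - y_i \right)^2
```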

I defined the following two functions:

def my_loss(y_true, y_pred, x):
    dydx = K.gradients(y_pred, x)
    return K.mean(K.square(dydx - y_true), axis=-1)

def my_loss_function(x):
    def gradLoss(y_true, y_pred):
        return my_loss(y_true, y_pred, x)
    return gradLoss


Then, in my model, I call

model_loss = my_loss_function(x)
model.compile(optimizer=Adam(lr=0.01),
              loss=model_loss)


But I get the following error:

ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.

For reference, the complete code is included below. What is the correct way to implement this kind of loss function?

import tensorflow as tf
from tensorflow import keras

import numpy as np
import math, random
import matplotlib.pyplot as plt

from keras.layers import Input, Dense
from keras.models import Model
from keras.optimizers import Adam

##################################################################################
# set neural network parameters
#     \param [in] NUM_HIDDEN_NODES: neurons per hidden layer
#     \param [in] NUM_EXAMPLES:     total number of examples
#     \param [in] TRAIN_SPLIT:      proportion of examples that are for training
#     \param [in] MINI_BATCH_SIZE:  batch size for optimization step
#     \param [in] NUM_EPOCHS:       iterations for training
##################################################################################
NUM_HIDDEN_NODES = 10
NUM_EXAMPLES = 500
TRAIN_SPLIT = .8
MINI_BATCH_SIZE = 100
NUM_EPOCHS = 400


##################################################################################
# define the function to approximate
#     \param [in] x: list of inputs to evaluate function at
##################################################################################
def my_function(x):
    return np.sin(x)


##################################################################################
# generate training and test data according to TRAIN_SPLIT
#     \param [in] start:    starting point for the data
#     \param [in] end:      ending point for the data
#     \param [out] x_train: data on which the neural network will be trained on
#     \param [out] y_train: data on which the neural network will be trained on
#     \param [out] x_test:  data on which the trained neural network will be 
#                           validated
#     \param [out] y_test:  data on which the trained neural network will be 
#                           validated
##################################################################################
def create_data(start, end):
    x = np.float32(np.random.uniform(start, end, (1, NUM_EXAMPLES))).T
    y = my_function(x)

    train_size = int(NUM_EXAMPLES*TRAIN_SPLIT)
    x_train = x[:train_size]
    x_test = x[train_size:]
    y_train = my_function(x_train)
    y_test = my_function(x_test)

    return (x_train, y_train, x_test, y_test)


from keras import backend as K

def my_loss(y_true, y_pred, x):
    dydx = K.gradients(y_pred, x)
    return K.mean(K.square(dydx - y_true), axis=-1)

def my_loss_function(x):
    def gradLoss(y_true, y_pred):
        return my_loss(y_true, y_pred, x)
    return gradLoss

##################################################################################
# generate the neural network model
##################################################################################
def create_model():
    x = Input(shape=(1, ))

    # hidden layers with tanh activation function
    h1 = Dense(10, activation="tanh")(x)
    h2 = Dense(10, activation="tanh")(h1)
    h3 = Dense(10, activation="tanh")(h2)

    # linear activation layer
    y = Dense(1, activation="linear")(h3)

    model = Model(x, y)

    model_loss = my_loss_function(x)

    model.compile(optimizer=Adam(lr=0.01),
                  loss=model_loss)

    return model


##################################################################################
# routine for training the neural network
#     \param [out] x_train: data on which the neural network will be trained on
#     \param [out] y_train: data on which the neural network will be trained on
#     \param [out] x_test:  data on which the trained neural network will be 
#                           validated
#     \param [out] y_test:  data on which the trained neural network will be 
#                           validated
#     \param [out] model:   the training neural network model
##################################################################################
def train(x_train, y_train, x_test, y_test):
    model = create_model()

    model.fit(x_train, y_train, 
              epochs=NUM_EPOCHS, 
              batch_size=MINI_BATCH_SIZE, 
              validation_data=[x_test, y_test]
              )

    return model

# generate training and test data and train the neural network
x_train, y_train, x_test, y_test = create_data(-2.0*math.pi, 2.0*math.pi)
model = train(x_train, y_train, x_test, y_test)


##################################################################################
# use the neural network model to compute the function at test data
#     \param [in] model: the trained neural network model
#     \param [in] x:     x values at which to test the trained model
##################################################################################
def predict_targets(model, x):
    return model.predict(x)


##################################################################################
# plot the exact data against the predicted data
#     \param [in] x: x data
#     \param [in] y_true: exact y data
#     \param [in] y_pred: neural network prediction
##################################################################################
def plot_predictions(x, y_true, y_pred):
    plt.figure(1)
    plt.plot(x, y_true)
    plt.plot(x, y_pred)
    plt.xlabel('x')
    plt.ylabel('y')
    plt.show()


# plot neural network prediction on the original training data
predictions = predict_targets(model, x_train)
indexes = list(range(len(x_train)))
indexes.sort(key=x_train.__getitem__)
x_train = list(map(x_train.__getitem__, indexes))
y_train = list(map(y_train.__getitem__, indexes))
predictions = list(map(predictions.__getitem__, indexes))
plot_predictions(x_train, y_train, predictions)


# plot neural network prediction on the validation/test data
x = np.linspace(-2*math.pi, 2*math.pi, 1000)
y = my_function(x)
predictions = predict_targets(model, x)
plot_predictions(x, y, predictions)


Answer 1

小智 0

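I found a package on GitHub that inspired me to customize the training loop (as described here). Below is an example that subclasses the Sequential class and adds the mean of the gradient of the loss with respect to the input as an extra penalty.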

import tensorflow as tf
from tensorflow import keras

class Custom(keras.Sequential):
    
    def train_step(self, data):
        # Unpack the data. Its structure depends on your model and
        # on what you pass to `fit()`.
        x, y = data        
        
        with tf.GradientTape(persistent=True) as tape:
            tape.watch(x)
            y_pred = self(x, training=True)  # Forward pass
            # Compute the loss value
            # (the loss function is configured in `compile()`)
            loss = self.compiled_loss(y, y_pred, regularization_losses=self.losses)

            loss_grad = tape.gradient(loss, x)
        
            loss = loss + tf.math.reduce_mean(loss_grad)
        
        # Compute gradients
        trainable_vars = self.trainable_variables
        gradients = tape.gradient(loss, trainable_vars)
        # Update weights
        self.optimizer.apply_gradients(zip(gradients, trainable_vars))
        del tape
        # Update metrics (includes the metric that tracks the loss)
        self.compiled_metrics.update_state(y, y_pred)
        # Return a dict mapping metric names to current value
        return {m.name: m.result() for m in self.metrics}
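
The same idea can be adapted to the loss asked about in the question (MSE between dy/dx and the target). The following is only a minimal sketch, assuming TensorFlow 2.x with `tf.keras`; the class name `GradientMatchingModel`, the method `grad_loss` and the toy data are hypothetical and not part of the original post. It nests two `tf.GradientTape` contexts: the inner one differentiates the prediction with respect to the input, the outer one differentiates the resulting loss with respect to the weights. The output-layer bias does not influence dy/dx, so its gradient comes back as `None`, which is probably what the error in the question reports; the sketch simply skips such variables.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras


class GradientMatchingModel(keras.Model):
    """Hypothetical wrapper that fits d(net(x))/dx to the targets."""

    def __init__(self, net):
        super().__init__()
        self.net = net

    def call(self, x, training=False):
        return self.net(x, training=training)

    def grad_loss(self, x, dydx_true):
        # Inner tape: derivative of the prediction with respect to the input.
        with tf.GradientTape() as inner:
            inner.watch(x)
            y_pred = self(x, training=True)
        dydx = inner.gradient(y_pred, x)
        return tf.reduce_mean(tf.square(dydx - dydx_true))

    def train_step(self, data):
        x, dydx_true = data
        # Outer tape: records the inner gradient computation so the loss
        # can be differentiated with respect to the weights.
        with tf.GradientTape() as outer:
            loss = self.grad_loss(x, dydx_true)
        variables = self.trainable_variables
        grads = outer.gradient(loss, variables)
        # Skip variables that received no gradient (e.g. the output-layer bias).
        self.optimizer.apply_gradients(
            [(g, v) for g, v in zip(grads, variables) if g is not None]
        )
        return {"loss": loss}


# Toy data (assumption): the target derivative is cos(x).
x_train = np.random.uniform(-np.pi, np.pi, (64, 1)).astype("float32")
dydx_train = np.cos(x_train)

net = keras.Sequential([
    keras.Input(shape=(1,)),
    keras.layers.Dense(10, activation="tanh"),
    keras.layers.Dense(1),
])
model = GradientMatchingModel(net)
# No `loss` argument: the loss is computed inside `train_step`.
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.01))
history = model.fit(x_train, dydx_train, epochs=3, batch_size=32, verbose=0)
```

Note that the second argument passed to `fit()` is treated here as the target derivative, not the target function value. Matching only the derivative leaves any constant offset in the output undetermined, so a value term would have to be added to the loss if the function itself also needs to be fitted.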


Archived:	6 years, 9 months ago
Viewed:	960 times
Last activity:	5 years, 3 months ago