Why does my loss function return NaN?

Question

mdo*_*fe1 0 gradient-descent keras tensorflow

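I defined this custom loss function in Keras, using the TensorFlow backend, to train a background-extraction autoencoder. It is supposed to ensure that the prediction x_hat does not stray too far from B0, the median of the predictions taken over the batch.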

def ben_loss(x, x_hat):

    B0 = tf_median(tf.transpose(x_hat))
    sigma = tf.reduce_mean(tf.sqrt(tf.abs(x_hat - B0) / 0.4), axis=0)
    # I divide by sigma in the next step. So I add a small float32 to sigma
    # so as to prevent background_term from becoming a nan.
    sigma += 1e-22 
    background_term = tf.reduce_mean(tf.abs(x_hat - B0) / sigma, axis=-1)
    bce = binary_crossentropy(x, x_hat)
    loss = bce + background_term

    return loss

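When I try to train the network with this loss function, the loss becomes NaN almost immediately. Does anyone know why this happens? You can reproduce the error by cloning my repository and running this script.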

Answer 1

mdo*_*fe1 5

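This happens because tf.abs(x_hat - B0) approaches a tensor whose entries are all zero. The derivative of tf.sqrt is unbounded at zero, so the derivative of sigma with respect to x_hat becomes NaN there, and adding 1e-22 to sigma afterwards does not help because the NaN arises in the gradient, not in the division. The solution is to add a small value to tf.abs(x_hat - B0) itself, before taking the square root.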

def ben_loss(x, x_hat):

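    B0 = tf_median(tf.transpose(x_hat))
    # Offset the absolute deviation so it never reaches exactly zero;
    # otherwise the gradient of tf.sqrt below becomes NaN.
    F0 = tf.abs(x_hat - B0) + 1e-10
    sigma = tf.reduce_mean(tf.sqrt(F0 / 0.4), axis=0)
    background_term = tf.reduce_mean(F0 / sigma, axis=-1)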
    bce = binary_crossentropy(x, x_hat)
    loss = bce + background_term

    return loss
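
To see why the offset has to go inside the square root, here is a minimal sketch in plain Python that imitates how reverse-mode autodiff chains the local derivatives of sqrt and abs. The function name sqrt_abs_grad and its eps parameter are hypothetical and only illustrate the mechanism; this is not TensorFlow's actual implementation.

```python
import math


def sqrt_abs_grad(x, eps=0.0):
    """Chain-rule derivative of sqrt(|x| + eps) with respect to x.

    Hypothetical helper: it multiplies the two local derivatives the way
    an autodiff framework would, so the 0 * inf case is visible.
    """
    u = abs(x) + eps
    # Local derivative of sqrt(u) is 0.5 / sqrt(u), which is +inf at u == 0.
    d_sqrt = 0.5 / math.sqrt(u) if u > 0 else math.inf
    # Local derivative of |x| is sign(x), taken as 0 at x == 0.
    d_abs = (x > 0) - (x < 0)
    # At x == 0 with eps == 0 this is inf * 0, which is NaN in IEEE 754.
    return d_sqrt * d_abs


print(sqrt_abs_grad(0.0))             # nan
print(sqrt_abs_grad(0.0, eps=1e-10))  # 0.0
```

With eps=0 the product at x=0 is inf * 0, which is NaN, and a single NaN in the gradient is enough to turn the weights and the loss into NaN after one update. With a positive eps the square-root derivative stays finite, so the product is an ordinary number.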
