Why is axis=-1 used in Keras metric functions?

Question

spi*_*der 6 deep-learning keras tensorflow

Keras version: 2.0.8

Some Keras metric functions and loss functions pass axis=-1 as an argument.

For example:

from keras import backend as K

def binary_accuracy(y_true, y_pred):
    return K.mean(K.equal(y_true, K.round(y_pred)), axis=-1)

In my case:

Shape of y_true: (4, 256, 256, 2)

Shape of y_pred: (4, 256, 256, 2)

So binary_accuracy(y_true, y_pred) should return a tensor of shape (4, 256, 256), not a scalar tensor.
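The shape claim can be checked without a Keras backend. The following is a minimal sketch that assumes NumPy as a stand-in for the backend calls; binary_accuracy_np is a hypothetical name used only for illustration, and the random inputs are made up.

```python
import numpy as np

def binary_accuracy_np(y_true, y_pred):
    # NumPy stand-in for K.mean(K.equal(y_true, K.round(y_pred)), axis=-1)
    return np.mean(np.equal(y_true, np.round(y_pred)), axis=-1)

rng = np.random.default_rng(0)
# Made-up data with the shapes from the question.
y_true = rng.integers(0, 2, size=(4, 256, 256, 2)).astype("float32")
y_pred = rng.random(size=(4, 256, 256, 2)).astype("float32")

per_pixel = binary_accuracy_np(y_true, y_pred)
print(per_pixel.shape)  # (4, 256, 256)
```

Only the last axis (of size 2) is averaged away, so the function on its own does return one value per sample and pixel.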

However, when binary_accuracy is used as a metric function:

model.compile(optimizer=adam, loss=keras.losses.binary_crossentropy, metrics=[binary_accuracy])

the log still prints binary_accuracy as a scalar, which confuses me.

Does Keras do something special with the return value of the binary_accuracy function?

Epoch 11/300

0s - loss: 0.4158 - binary_accuracy: 0.9308 - val_loss: 0.4671 - val_binary_accuracy: 0.7767

Answer 1

Yu-*_*ang 1

Here is what you are looking for, in training_utils.py:

def weighted_masked_objective(fn):
    # Outer wrapper, omitted from the original excerpt; fn is the
    # metric or loss function being wrapped.
    def weighted(y_true, y_pred, weights, mask=None):
        """Wrapper function.
        # Arguments
            y_true: `y_true` argument of `fn`.
            y_pred: `y_pred` argument of `fn`.
            weights: Weights tensor.
            mask: Mask tensor.
        # Returns
            Scalar tensor.
        """
        # score_array has ndim >= 2
        score_array = fn(y_true, y_pred)
        if mask is not None:
            # Cast the mask to floatX to avoid float64 upcasting in Theano
            mask = K.cast(mask, K.floatx())
            # mask should have the same shape as score_array
            score_array *= mask
            #  the loss per batch should be proportional
            #  to the number of unmasked samples.
            score_array /= K.mean(mask) + K.epsilon()

        # apply sample weighting
        if weights is not None:
            # reduce score_array to same ndim as weight array
            ndim = K.ndim(score_array)
            weight_ndim = K.ndim(weights)
            score_array = K.mean(score_array,
                                 axis=list(range(weight_ndim, ndim)))
            score_array *= weights
            score_array /= K.mean(K.cast(K.not_equal(weights, 0), K.floatx()))
        return K.mean(score_array)
    return weighted

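The metric function is called by score_array = fn(y_true, y_pred) (weighted is a nested function, and fn is defined in the outer function). The resulting array is then averaged in the last line, return K.mean(score_array). That is why you see a scalar metric rather than a tensor. The lines in between only apply the mask and the sample weights when necessary.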
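The effect of that final mean can be illustrated with a simplified sketch. It assumes NumPy in place of the Keras backend and leaves out mask handling; weighted_np and binary_accuracy_np are hypothetical names, not part of Keras, and the inputs are made up.

```python
import numpy as np

def binary_accuracy_np(y_true, y_pred):
    # Per-position accuracy: mean over the last axis only.
    return np.mean(np.equal(y_true, np.round(y_pred)), axis=-1)

def weighted_np(fn, y_true, y_pred, weights=None):
    # Hypothetical, simplified stand-in for the wrapper (no mask support).
    score_array = fn(y_true, y_pred)
    if weights is not None:
        # Reduce score_array to the same ndim as the weights array.
        axes = tuple(range(weights.ndim, score_array.ndim))
        if axes:
            score_array = np.mean(score_array, axis=axes)
        score_array = score_array * weights
        score_array = score_array / np.mean(weights != 0)
    # Final reduction: whatever shape is left collapses to a scalar.
    return np.mean(score_array)

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(4, 8, 8, 2)).astype("float32")
y_pred = rng.random(size=(4, 8, 8, 2)).astype("float32")

unweighted = weighted_np(binary_accuracy_np, y_true, y_pred)
with_weights = weighted_np(binary_accuracy_np, y_true, y_pred, weights=np.ones(4))
print(np.ndim(unweighted), np.ndim(with_weights))  # 0 0
```

With per-sample weights of shape (4,), the per-pixel scores are first averaged down to one value per sample, multiplied by the weights, and only then reduced to the scalar that appears in the training log.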

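This answer explains why it is fine to take the mean over only the last axis, but I am still not sure why it was designed this way. It sounds like the mean is taken later anyway, so why take the mean over axis=-1 in the loss function? Wouldn't it be more efficient not to take any mean in the loss function definition? (4 upvotes)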

Archived:	8 years, 5 months ago
Views:	1590
Last recorded:	6 years, 5 months ago