Balanced accuracy score in TensorFlow

Question


Mat*_*tti 6 metrics machine-learning neural-network conv-neural-network tensorflow

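I am implementing a CNN for a highly imbalanced classification problem, and I would like to implement a custom metric in TensorFlow so that I can use it with the Select Best Model callback. Specifically, I want to implement the balanced accuracy score, which is the average of the recall obtained on each class (see the sklearn implementation). Does anyone know how to do this?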
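For reference, the definition in the question (mean of per-class recall) can be sketched in plain NumPy. This is only an illustration of the definition, not the sklearn source; the function name `balanced_accuracy` and the toy labels are made up for the example.

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, over the classes that occur in y_true."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = []
    for cls in np.unique(y_true):
        mask = y_true == cls                      # samples whose true label is cls
        recalls.append(np.mean(y_pred[mask] == cls))  # recall for this class
    return float(np.mean(recalls))

# Toy data: class 0 has recall 3/4, class 1 has recall 1/2.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
score = balanced_accuracy(y_true, y_pred)         # (0.75 + 0.5) / 2
plain = float(np.mean(np.array(y_true) == np.array(y_pred)))  # 4 / 6
print(score, plain)
```

On this toy data the plain accuracy is 4/6 while the balanced score is 0.625, which is why the two can rank models differently on imbalanced data.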

Answer 1

Aar*_*ing 5

I was facing the same issue, so I implemented a custom class based on SparseCategoricalAccuracy:

import tensorflow as tf
from tensorflow import keras

class BalancedSparseCategoricalAccuracy(keras.metrics.SparseCategoricalAccuracy):
    def __init__(self, name='balanced_sparse_categorical_accuracy', dtype=None):
        super().__init__(name, dtype=dtype)

    def update_state(self, y_true, y_pred, sample_weight=None):
        # Drop the trailing axis when labels arrive with shape (batch, 1).
        y_flat = y_true
        if y_true.shape.ndims == y_pred.shape.ndims:
            y_flat = tf.squeeze(y_flat, axis=[-1])
        y_true_int = tf.cast(y_flat, tf.int32)

        # Count samples per class in this batch, then invert (0 for absent classes).
        cls_counts = tf.math.bincount(y_true_int)
        cls_counts = tf.math.reciprocal_no_nan(tf.cast(cls_counts, self.dtype))
        # Weight each sample by 1 / (size of its class); the incoming
        # sample_weight argument is ignored.
        weight = tf.gather(cls_counts, y_true_int)
        return super().update_state(y_true, y_pred, sample_weight=weight)


The idea is to give each class a weight that is inversely proportional to its size.
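To see why inverse class-size weights give the balanced score, here is a small NumPy check on toy labels (the arrays are made up for illustration). It mirrors the weighting idea only, not the Keras internals: a weighted accuracy with weight 1/(class size) per sample equals the mean per-class recall when every class appears in the data being scored.

```python
import numpy as np

y_true = np.array([0, 0, 0, 0, 1, 1])
y_pred = np.array([0, 0, 0, 1, 1, 0])

# Inverse class counts, with 0 for classes that do not occur
# (the same role reciprocal_no_nan plays in the metric).
counts = np.bincount(y_true)
inv_counts = np.zeros(len(counts), dtype=float)
present = counts > 0
inv_counts[present] = 1.0 / counts[present]

weights = inv_counts[y_true]                 # one weight per sample
matches = (y_true == y_pred).astype(float)   # 1.0 where the prediction is right

weighted_acc = float(np.sum(weights * matches) / np.sum(weights))

# Mean per-class recall computed directly, for comparison.
mean_recall = float(np.mean([matches[y_true == c].mean() for c in np.unique(y_true)]))
print(weighted_acc, mean_recall)
```

One caveat follows from this: the class counts are taken per call to `update_state`, so with mini-batches the accumulated value is an average of per-batch class recalls. It matches the dataset-level balanced accuracy exactly only if the whole evaluation set is scored in one batch.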

This code produces some warnings from Autograph, but I believe those are Autograph bugs, and the metric seems to work fine.

Answer 2

小智 3

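I can think of three ways to tackle this situation:

1) Random undersampling: in this method you randomly remove samples from the majority class.

2) Random oversampling: in this method you increase the number of minority-class samples by replicating them.

3) Weighted cross-entropy: you can also use a weighted cross-entropy loss, so that the loss contribution of the minority classes is compensated.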

I have personally tried method 2 and it did improve my accuracy, but results may vary from dataset to dataset.
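Methods 2 and 3 can be sketched in NumPy as follows. This is a minimal illustration on made-up toy data, not a recommendation of specific settings: the variable names are hypothetical, and the class-weight formula `n_samples / (n_classes * count)` is one common heuristic rather than the only choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy imbalanced dataset: six samples of class 0, two of class 1.
y = np.array([0, 0, 0, 0, 0, 0, 1, 1])
X = np.arange(len(y)).reshape(-1, 1)   # placeholder features

classes, counts = np.unique(y, return_counts=True)
target = counts.max()                  # bring every class up to the largest class

# Method 2: random oversampling by drawing extra indices with replacement.
index_parts = []
for cls, n in zip(classes, counts):
    cls_idx = np.flatnonzero(y == cls)
    extra = rng.choice(cls_idx, size=target - n, replace=True)
    index_parts.append(np.concatenate([cls_idx, extra]))
resampled_idx = np.concatenate(index_parts)
X_over, y_over = X[resampled_idx], y[resampled_idx]

# Method 3: per-class weights for a weighted cross-entropy loss.
class_weight = {
    int(cls): len(y) / (len(classes) * int(n)) for cls, n in zip(classes, counts)
}
print(np.bincount(y_over), class_weight)
```

Oversampling should be applied to the training split only, so that duplicated samples do not leak into validation data. A dictionary like `class_weight` can be passed to Keras through the `class_weight` argument of `model.fit`.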
