Scikit weighted F1 score: calculation and usage

Question

com*_*com 2 nlp machine-learning scikit-learn precision-recall

I have a question about the 'weighted' average in sklearn.metrics.f1_score

sklearn.metrics.f1_score(y_true, y_pred, labels=None, pos_label=1, average='weighted', sample_weight=None)

Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance; it can result in an F-score that is not between precision and recall.

First, is there any reference that justifies the use of weighted-F1? I am just curious in which cases I should use weighted-F1.

Second, I heard that weighted-F1 is deprecated. Is that true?

Third, how is weighted-F1 actually calculated? For example:

{
    "0": {
        "TP": 2,
        "FP": 1,
        "FN": 0,
        "F1": 0.8
    },
    "1": {
        "TP": 0,
        "FP": 2,
        "FN": 2,
        "F1": -1
    },
    "2": {
        "TP": 1,
        "FP": 1,
        "FN": 2,
        "F1": 0.4
    }
}

How is weighted-F1 calculated for the example above? I thought it should be something like (0.8 * 2/3 + 0.4 * 1/3) / 3, but I was wrong.

Answer 1

jak*_*vdp 6

First, is there any reference that justifies the use of weighted-F1? I am just curious in which cases I should use weighted-F1.

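I don't have any references, but if you're interested in multi-label classification where you care about the precision/recall of all classes, then the weighted f1-score is appropriate. If you have binary classification where you only care about the positive samples, then it is probably not appropriate.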
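The distinction above can be illustrated with a small sketch. It does not call scikit-learn. Instead, it re-implements per-class F1 in plain Python under the assumption that each class is scored one-vs-rest and that the weighted average uses the support (the number of true instances) as weights, as the quoted docstring describes. The helper name per_class_f1 and the toy labels are hypothetical, chosen only for illustration.

```python
from collections import Counter

def per_class_f1(y_true, y_pred):
    """Return {label: (f1, support)} from one-vs-rest counts (illustrative helper)."""
    support = Counter(y_true)
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); treated as 0.0 when the denominator is zero
        f1 = 2 * tp / denom if denom else 0.0
        scores[label] = (f1, support[label])
    return scores

# Hypothetical imbalanced binary data: 8 negatives, 2 positives, one positive missed.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [0, 1]

scores = per_class_f1(y_true, y_pred)
positive_f1 = scores[1][0]  # F1 of the positive class only
total_support = sum(s for _, s in scores.values())
weighted_f1 = sum(f1 * s for f1, s in scores.values()) / total_support

print(f"positive-class F1: {positive_f1:.3f}")
print(f"weighted F1:       {weighted_f1:.3f}")
```

In this toy case the weighted score is pulled up by the large, easy negative class, so it looks much better than the F1 of the positive class alone. That is one way to read the remark that the weighted average is probably not what you want when only the positive samples matter.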

Second, I heard that weighted-F1 is deprecated. Is that true?

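No, weighted-F1 itself is not deprecated. Only some aspects of the function interface were deprecated, back in v0.16, and then only to make the behavior more explicit in previously ambiguous situations. (See the historical discussion on GitHub, or check the source code and search the page for "deprecated" to find the details.)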

Third, how is weighted-F1 actually calculated?

From the documentation of f1_score:

``'weighted'``:
  Calculate metrics for each label, and find their average, weighted
  by support (the number of true instances for each label). This
  alters 'macro' to account for label imbalance; it can result in an
  F-score that is not between precision and recall.

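So the average is weighted by the support, which is the number of samples with a given true label. Your example data does not list the support explicitly, so the weighted f1 score cannot be computed from the F1 values alone. If TP and FN are per-class counts, though, the support of each label is TP + FN.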
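As a worked sketch of the support-weighted average, the snippet below takes the TP/FP/FN counts from the question and assumes they are one-vs-rest counts per class, so that the support of a label can be recovered as TP + FN. It also assumes the F1 of class "1" is 0 (what 2TP / (2TP + FP + FN) gives for TP = 0) rather than the -1 placeholder shown in the question. The helper f1_from_counts is hypothetical and is not part of scikit-learn.

```python
# Counts copied from the question; the "F1" entries are recomputed below.
counts = {
    "0": {"TP": 2, "FP": 1, "FN": 0},
    "1": {"TP": 0, "FP": 2, "FN": 2},
    "2": {"TP": 1, "FP": 1, "FN": 2},
}

def f1_from_counts(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); returns 0.0 if the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

f1 = {k: f1_from_counts(c["TP"], c["FP"], c["FN"]) for k, c in counts.items()}

# Assumption: support (true instances of a label) equals TP + FN.
support = {k: c["TP"] + c["FN"] for k, c in counts.items()}
total = sum(support.values())

# Weighted average: each class's F1 weighted by its share of the true labels.
weighted_f1 = sum(f1[k] * support[k] for k in counts) / total
# Macro average for comparison: every class counts equally.
macro_f1 = sum(f1.values()) / len(f1)

print("per-class F1:", f1)
print("support:     ", support)
print(f"weighted F1: {weighted_f1:.3f}")
print(f"macro F1:    {macro_f1:.3f}")
```

Under these assumptions the supports are 2, 2 and 3, and the weighted score is (0.8 * 2 + 0 * 2 + 0.4 * 3) / 7, which is about 0.4. The macro average happens to come out at about 0.4 as well for these particular counts. The two generally differ when class sizes and per-class scores are more uneven.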

Archived:	9 years, 11 months ago
Views:	3687
Last activity:	6 years, 7 months ago