What is the loss function used in Trainer from the Transformers library of Hugging Face?

Question

sta*_*kos 11 python nlp artificial-intelligence machine-learning huggingface-transformers

I am trying to fine-tune a BERT model using the Trainer class from Hugging Face's Transformers library.

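Their documentation mentions that a custom loss function can be specified by overriding the compute_loss method of the class. However, if I do not override that method and use the Trainer directly to fine-tune a BERT model for sentiment classification, what is the default loss function? Is it categorical cross-entropy? Thanks!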

Answer 1

den*_*ger 18

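It depends! Given your fairly vague description of the setup, it is not clear which loss will be used. But to start from the beginning, let's first check what the default compute_loss() function in the Trainer class looks like.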

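If you want to look for yourself, the corresponding function can be found in the Trainer source code (the current version at the time of writing is 4.17). With default arguments, the loss that is actually returned is taken from the model's output values: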

loss = outputs["loss"] if isinstance(outputs, dict) else outputs[0]

This means that the model itself is (by default) responsible for computing some sort of loss and returning it in outputs.
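To make that lookup concrete, here is a minimal sketch that applies the same expression to a dict-style output and to a tuple-style output. The helper name extract_loss and the plain floats standing in for loss tensors are assumptions made for illustration; they are not part of the library.

```python
def extract_loss(outputs):
    """Hypothetical helper wrapping the expression quoted from compute_loss()."""
    # Dict-like outputs are indexed by key, anything else by position 0.
    return outputs["loss"] if isinstance(outputs, dict) else outputs[0]


# Plain floats stand in for loss tensors (an assumption for illustration).
dict_outputs = {"loss": 0.42, "logits": [0.1, 0.9]}
tuple_outputs = (0.42, [0.1, 0.9])

print(extract_loss(dict_outputs))   # loss read from the "loss" key
print(extract_loss(tuple_outputs))  # loss read from the first tuple element
```

Either way, the Trainer only forwards a value that the model has already computed, so the interesting part is how the model arrives at that value.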

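Next, we can look at the actual model definition for BERT in the Transformers source code, in particular at the model that would be used for your sentiment analysis task (I assume a BertForSequenceClassification model).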

The code relevant for defining the loss function looks like this:

if labels is not None:
    if self.config.problem_type is None:
        if self.num_labels == 1:
            self.config.problem_type = "regression"
        elif self.num_labels > 1 and (labels.dtype == torch.long or labels.dtype == torch.int):
            self.config.problem_type = "single_label_classification"
        else:
            self.config.problem_type = "multi_label_classification"

    if self.config.problem_type == "regression":
        loss_fct = MSELoss()
        if self.num_labels == 1:
            loss = loss_fct(logits.squeeze(), labels.squeeze())
        else:
            loss = loss_fct(logits, labels)
    elif self.config.problem_type == "single_label_classification":
        loss_fct = CrossEntropyLoss()
        loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))
    elif self.config.problem_type == "multi_label_classification":
        loss_fct = BCEWithLogitsLoss()
        loss = loss_fct(logits, labels)

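Based on this information, you should be able to either set the correct loss function yourself (by changing model.config.problem_type accordingly), or at least determine which loss will be chosen based on the properties of your task (number of labels, label dtype, etc.).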
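As a rough aid for that last step, the branch logic of the excerpt can be mirrored in a few lines of plain Python. This is only a sketch: default_loss_name and its labels_are_integer flag (standing in for the torch.long / torch.int dtype check) are hypothetical names, and raising an error on an unknown problem_type is a choice made here, not the library's behavior.

```python
from typing import Optional


def default_loss_name(
    num_labels: int,
    labels_are_integer: bool,
    problem_type: Optional[str] = None,
) -> str:
    """Hypothetical helper: return the loss class name the excerpt would pick."""
    if problem_type is None:
        # Same inference order as in the quoted model code.
        if num_labels == 1:
            problem_type = "regression"
        elif num_labels > 1 and labels_are_integer:
            problem_type = "single_label_classification"
        else:
            problem_type = "multi_label_classification"

    losses = {
        "regression": "MSELoss",
        "single_label_classification": "CrossEntropyLoss",
        "multi_label_classification": "BCEWithLogitsLoss",
    }
    if problem_type not in losses:
        # Assumption of this sketch: fail loudly on an unrecognized value.
        raise ValueError(f"Unknown problem_type: {problem_type!r}")
    return losses[problem_type]


# Binary sentiment classification with integer class ids as labels.
print(default_loss_name(num_labels=2, labels_are_integer=True))
# A single continuous score per example.
print(default_loss_name(num_labels=1, labels_are_integer=False))
# Several float-valued labels per example.
print(default_loss_name(num_labels=3, labels_are_integer=False))
```

For the setup in the question (two or more sentiment classes with integer labels and no explicit problem_type), this lands on the cross-entropy branch.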
