Cpt*_*aas 25 python logging tensorflow huggingface-transformers
I am using the Hugging Face Trainer with a BertForSequenceClassification.from_pretrained("bert-base-uncased") model.
Simplified, it looks like this:
from transformers import (
    BertForSequenceClassification,
    BertTokenizer,
    Trainer,
    TrainingArguments,
)

model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

training_args = TrainingArguments(
    output_dir="bert_results",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=32,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="bert_results/logs",
    logging_steps=10,
)

# train_dataset, val_dataset and compute_metrics are defined elsewhere (omitted here)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    compute_metrics=compute_metrics,
)
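The logs contain the loss every 10 steps, but I can't find the training accuracy anywhere. Does anyone know how to get the accuracy, for example by changing the logger's verbosity? I can't find anything about it online.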
小智 12
You can load the accuracy metric and use it in your compute_metrics function. For example, it would look like this:
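import numpy as np
from datasets import load_metric  # newer releases moved this to the `evaluate` package: evaluate.load("accuracy")

metric = load_metric("accuracy")

def compute_metrics(eval_pred):
    # eval_pred unpacks to (logits, labels)
    predictions, labels = eval_pred
    predictions = np.argmax(predictions, axis=1)
    return metric.compute(predictions=predictions, references=labels)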
This compute_metrics example is based on Hugging Face's text classification tutorial. It worked in my tests.
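To see what compute_metrics receives and returns without downloading a metric, here is a minimal NumPy-only sketch. It assumes, as the answer's code does, that the argument unpacks into (logits, labels); the toy arrays are made up for illustration. Note that Trainer calls this function during evaluation, so the value it reports is evaluation accuracy (logged with an eval_ prefix), not training accuracy.

```python
import numpy as np

def compute_metrics(eval_pred):
    """Return accuracy for a (logits, labels) pair."""
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)  # most likely class per example
    return {"accuracy": float((predictions == labels).mean())}

# Toy check: hand-made logits for three examples and two classes (illustrative data).
toy_logits = np.array([[2.0, 0.1], [0.3, 1.5], [0.9, 0.2]])
toy_labels = np.array([0, 1, 1])
toy_result = compute_metrics((toy_logits, toy_labels))
print(toy_result)
```

The third toy example is misclassified, so two of the three predictions match their labels.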
sid*_*491 12
I had the same problem and solved it by adding a custom callback that calls the evaluate() method with train_dataset at the end of each epoch.
from copy import deepcopy

from transformers import Trainer, TrainerCallback

class CustomCallback(TrainerCallback):
    def __init__(self, trainer) -> None:
        super().__init__()
        self._trainer = trainer

    def on_epoch_end(self, args, state, control, **kwargs):
        if control.should_evaluate:
            # Keep a copy of the control flags so the regular evaluation still runs afterwards
            control_copy = deepcopy(control)
            self._trainer.evaluate(eval_dataset=self._trainer.train_dataset, metric_key_prefix="train")
            return control_copy

trainer = Trainer(
    model=model,                      # the instantiated Transformers model to be trained
    args=training_args,               # training arguments, defined above
    train_dataset=train_dataset,      # training dataset
    eval_dataset=valid_dataset,       # evaluation dataset
    compute_metrics=compute_metrics,  # the callback that computes metrics of interest
    tokenizer=tokenizer,
)
trainer.add_callback(CustomCallback(trainer))
train = trainer.train()
This produces training metrics like the following:
{'train_loss': 0.7159061431884766, 'train_accuracy': 0.4, 'train_f1': 0.5714285714285715, 'train_runtime': 6.2973, 'train_samples_per_second': 2.382, 'train_steps_per_second': 0.159, 'epoch': 1.0}
{'eval_loss': 0.8529007434844971, 'eval_accuracy': 0.0, 'eval_f1': 0.0, 'eval_runtime': 2.0739, 'eval_samples_per_second': 0.964, 'eval_steps_per_second': 0.482, 'epoch': 1.0}
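If you want these values after training rather than only in the console, Trainer keeps the logged dictionaries in trainer.state.log_history. The sketch below assumes the records there have the same shape as the two log lines above; metric_history is a hypothetical helper name, and the sample records are shortened copies of that output so the snippet runs without a trainer.

```python
def metric_history(log_history, key):
    """Collect (epoch, value) pairs from every log record that contains `key`."""
    return [(record.get("epoch"), record[key]) for record in log_history if key in record]

# Sample records shaped like the two log lines above (values shortened).
sample_history = [
    {"train_loss": 0.7159, "train_accuracy": 0.4, "train_f1": 0.5714, "epoch": 1.0},
    {"eval_loss": 0.8529, "eval_accuracy": 0.0, "eval_f1": 0.0, "epoch": 1.0},
]

train_acc = metric_history(sample_history, "train_accuracy")
eval_acc = metric_history(sample_history, "eval_accuracy")
print(train_acc, eval_acc)
```

With a real run you would pass trainer.state.log_history instead of sample_history.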
Another way to get the training accuracy is to extend the base Trainer class and override the compute_loss() method, like this:
import torch
from sklearn.metrics import accuracy_score
from transformers import Trainer

class CustomTrainer(Trainer):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    # **kwargs absorbs extra arguments (such as num_items_in_batch) that newer transformers releases pass
    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        """
        How the loss is computed by Trainer. By default, all models return the loss in the first element.
        Subclass and override for custom behavior.
        """
        if self.label_smoother is not None and "labels" in inputs:
            labels = inputs.pop("labels")
        else:
            labels = None
        outputs = model(**inputs)

        # code for calculating accuracy
        if "labels" in inputs:
            preds = outputs.logits.detach().argmax(dim=-1)
            targets = inputs["labels"].reshape(-1)
            # scikit-learn works on CPU data, so move the tensors off the GPU first
            acc1 = float(accuracy_score(targets.cpu().numpy(), preds.cpu().numpy()))
            self.log({"accuracy_score": acc1})
            acc = (preds == targets).type(torch.float).mean().item()
            self.log({"train_accuracy": acc})
        # end code for calculating accuracy

        # Save past state if it exists
        # TODO: this needs to be fixed and made cleaner later.
        if self.args.past_index >= 0:
            self._past = outputs[self.args.past_index]

        if labels is not None:
            loss = self.label_smoother(outputs, labels)
        else:
            # We don't use .loss here since the model may return tuples instead of ModelOutput.
            loss = outputs["loss"] if isinstance(outputs, dict) else outputs[0]
        return (loss, outputs) if return_outputs else loss
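The accuracy logged this way is computed per batch, so with a batch size of 8 it can jump around a lot from step to step. One option, shown here as a sketch rather than as part of the answer, is to accumulate correct and total counts and read off a running accuracy. RunningAccuracy is a hypothetical helper, and plain Python lists stand in for the prediction and label tensors (with tensors you could call .tolist() first).

```python
class RunningAccuracy:
    """Accumulate correct/total counts across batches (hypothetical helper)."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, predictions, labels):
        # Count matching prediction/label pairs in this batch.
        pairs = list(zip(predictions, labels))
        self.correct += sum(int(p == y) for p, y in pairs)
        self.total += len(pairs)

    def value(self):
        # Accuracy over everything seen so far; 0.0 before the first update.
        return self.correct / self.total if self.total else 0.0

tracker = RunningAccuracy()
tracker.update([1, 0, 1, 1], [1, 0, 0, 1])  # 3 of 4 correct
tracker.update([0, 0], [0, 1])              # 1 of 2 correct
epoch_accuracy = tracker.value()
print(epoch_accuracy)
```

Because the counts are weighted by batch size, a smaller final batch does not skew the result the way averaging per-batch accuracies would.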
Then use CustomTrainer instead of Trainer, like this:
trainer = CustomTrainer(
    model=model,                      # the instantiated Transformers model to be trained
    args=training_args,               # training arguments, defined above
    train_dataset=train_dataset,      # training dataset
    eval_dataset=valid_dataset,       # evaluation dataset
    compute_metrics=compute_metrics,  # the callback that computes metrics of interest
    tokenizer=tokenizer,
)