Why doesn't the Trainer report evaluation metrics during training in the tutorial?

mar*_*lon 5 python transformer-model huggingface-transformers

I am following this tutorial to learn the Trainer API: https://huggingface.co/transformers/training.html


I copied the code as follows:

from datasets import load_dataset

import numpy as np
from datasets import load_metric

metric = load_metric("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)

print('Download dataset ...')
raw_datasets = load_dataset("imdb")
from transformers import AutoTokenizer

print('Tokenize text ...')
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)
tokenized_datasets = raw_datasets.map(tokenize_function, batched=True)

print('Prepare data ...')
small_train_dataset = tokenized_datasets["train"].shuffle(seed=42).select(range(500))
small_eval_dataset = tokenized_datasets["test"].shuffle(seed=42).select(range(500))
full_train_dataset = tokenized_datasets["train"]
full_eval_dataset = tokenized_datasets["test"]

print('Define model ...')
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=2)

print('Define trainer ...')
from transformers import TrainingArguments, Trainer
training_args = TrainingArguments("test_trainer", evaluation_strategy="epoch")
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=small_train_dataset,
    eval_dataset=small_eval_dataset,
    compute_metrics=compute_metrics,
)

print('Fine-tune train ...')
trainer.evaluate()

However, it does not report any training metrics. Instead, it prints the following output:

Download dataset ...
Reusing dataset imdb (/Users/congminmin/.cache/huggingface/datasets/imdb/plain_text/1.0.0/4ea52f2e58a08dbc12c2bd52d0d92b30b88c00230b4522801b3636782f625c5b)
Tokenize text ...
100%|██████████| 25/25 [00:06<00:00,  4.01ba/s]
100%|██████████| 25/25 [00:06<00:00,  3.99ba/s]
100%|██████████| 50/50 [00:13<00:00,  3.73ba/s]
Prepare data ...
Define model ...
Some weights of the model checkpoint at bert-base-cased were not used when initializing BertForSequenceClassification: ['cls.seq_relationship.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.weight']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-cased and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Define trainer ...
Fine-tune train ...
100%|██████████| 63/63 [08:35<00:00,  8.19s/it]

Process finished with exit code 0

Has the tutorial not been updated? Do I need to change some configuration to get the metrics reported?


小智 1

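The evaluate function returns the metrics but does not print them. Does

metrics = trainer.evaluate()
print(metrics)

work? Also, the message indicates that you are using the base BERT model, which was pretrained as a general language model rather than for sentence classification. It therefore has no trained weights for the classification task (the log shows the classifier weights are newly initialized) and should be trained first.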
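Both points can be combined in a minimal sketch, assuming `trainer` is the `Trainer` object built in the question. The helper name `train_and_report` is hypothetical and not part of the library; `train()` and `evaluate()` are the `Trainer` methods, and `evaluate()` returns a dictionary of metrics. Note that the script in the question never calls `trainer.train()`, so the per-epoch evaluation requested by `evaluation_strategy="epoch"` never runs.

```python
def train_and_report(trainer):
    """Fine-tune, then evaluate and print the metrics.

    `trainer` is assumed to be a transformers.Trainer (or any object with
    compatible train() and evaluate() methods). The function name is
    illustrative only.
    """
    # Fine-tune first: the classification head starts out newly initialized.
    trainer.train()
    # evaluate() returns a dict of metrics; it has to be printed explicitly.
    metrics = trainer.evaluate()
    for name, value in metrics.items():
        print(f"{name}: {value}")
    return metrics
```

Calling `train_and_report(trainer)` in place of the bare `trainer.evaluate()` at the end of the script should print the evaluation loss together with whatever `compute_metrics` returns.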