SMM*_*iSP 6 python nlp machine-learning deep-learning huggingface-transformers
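I have a function that loads a pretrained model from Hugging Face, fine-tunes it for sentiment analysis, then computes the F1 score and returns the result. The problem is that when I call this function multiple times with exactly the same arguments, it gives exactly the same metric score every time, as expected, except for the first call, which is different. How is that possible?
Here is my function, written following this tutorial from Hugging Face: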
import uuid

import numpy as np
from datasets import (
    load_dataset,
    load_metric,
    DatasetDict,
    concatenate_datasets
)
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    DataCollatorWithPadding,
    TrainingArguments,
    Trainer,
)

CHECKPOINT = "distilbert-base-uncased"
SAVING_FOLDER = "sst2"


def custom_train(datasets, checkpoint=CHECKPOINT, saving_folder=SAVING_FOLDER):
    # The classification head added on top of the pretrained encoder
    # is randomly initialized at this point.
    model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)

    def tokenize_function(example):
        return tokenizer(example["sentence"], truncation=True)

    tokenized_datasets = datasets.map(tokenize_function, batched=True)
    data_collator = DataCollatorWithPadding(tokenizer=tokenizer)

    # Give every run its own output folder (honors the saving_folder argument).
    saving_folder = f"{saving_folder}_{uuid.uuid1()}"
    training_args = TrainingArguments(saving_folder)

    trainer = Trainer(
        model,
        training_args,
        train_dataset=tokenized_datasets["train"],
        eval_dataset=tokenized_datasets["validation"],
        data_collator=data_collator,
        tokenizer=tokenizer,
    )
    trainer.train()

    # Evaluate on the held-out split and compute F1 from the argmax predictions.
    predictions = trainer.predict(tokenized_datasets["test"])
    print(predictions.predictions.shape, predictions.label_ids.shape)
    preds = np.argmax(predictions.predictions, axis=-1)
    metric_fun = load_metric("f1")
    metric_result = metric_fun.compute(predictions=preds, references=predictions.label_ids)
    return metric_result
Then I run this function several times on the same datasets and append the returned F1 score each time:
raw_datasets = load_dataset("glue", "sst2")
small_datasets = DatasetDict({
    "train": raw_datasets["train"].select(range(100)).flatten_indices(),
    "validation": raw_datasets["validation"].select(range(100)).flatten_indices(),
    "test": raw_datasets["validation"].select(range(100, 200)).flatten_indices(),
})

results = []
for i in range(4):
    result = custom_train(small_datasets)
    results.append(result)
Then, when I inspect the results list:
[{'f1': 0.7755102040816325}, {'f1': 0.5797101449275361}, {'f1': 0.5797101449275361}, {'f1': 0.5797101449275361}]
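One thing that comes to mind is that when I load the pretrained model, the head is initialized with random weights, which would explain why the results differ. But if that is the case, why is only the first run different while the others are exactly the same?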
SMM*_*iSP 12
Sylvain Gugger answered this question here: https://discuss.huggingface.co/t/multiple-training-will-give-exactly-the-same-result-except-for-the-first-time/8493
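You need to set the seed before instantiating your model, otherwise the random head is not initialized the same way; that's why the first run will always be different.
The subsequent runs are all the same because the seed has been set by the Trainer in the train method.
To set the seed:

from transformers import set_seed

set_seed(42)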
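To make the mechanism in the answer concrete, here is a toy sketch that uses only Python's standard `random` module instead of the real libraries. Everything in it is a hypothetical stand-in: `toy_run` mimics `custom_train`, one random draw stands in for the randomly initialized classification head, and the `random.seed(42)` call inside the function stands in for the seed the Trainer sets during training. It is an illustration of the ordering problem, not a reproduction of what `transformers` does internally.

```python
import random


def toy_run(seed_before_init=None):
    """Hypothetical stand-in for custom_train; returns (head_init, training_noise)."""
    if seed_before_init is not None:
        # Analogous to calling set_seed(...) before instantiating the model.
        random.seed(seed_before_init)
    # Stand-in for the random initialization of the classification head.
    head_init = random.random()
    # Stand-in for the seed that the Trainer sets during training (assumed 42 here).
    random.seed(42)
    # Stand-in for the randomness consumed by training (shuffling, dropout, ...).
    training_noise = tuple(random.random() for _ in range(3))
    return head_init, training_noise


# Start from an unknown generator state, as in a fresh Python process.
random.seed()

# Without seeding first: run 1 draws its head from the unknown state, while
# runs 2-4 all draw theirs from the state left behind by the previous run.
unseeded = [toy_run() for _ in range(4)]

# With the seed set before "instantiating the model", every run matches.
seeded = [toy_run(seed_before_init=42) for _ in range(4)]

print("runs 2-4 identical without early seed:", unseeded[1] == unseeded[2] == unseeded[3])
print("all runs identical with early seed:", len(set(seeded)) == 1)
```

In the toy version, the first unseeded run differs only in its head draw, which mirrors the single outlier F1 score in the question. Applied to the real code, the answer's fix amounts to calling `set_seed(42)` before `AutoModelForSequenceClassification.from_pretrained(...)`, for example at the top of `custom_train`.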