Getting the warning "You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference" when loading a fine-tuned model

emm*_*mma 5 python python-3.x tensorflow pytorch huggingface-transformers

I get this message when loading a fine-tuned BERT model, with a feed-forward network on the final layer, from a checkpoint directory.

    This IS expected if you are initializing FlaubertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
    - This IS NOT expected if you are initializing FlaubertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
    Some weights of FlaubertForSequenceClassification were not initialized from the model checkpoint at /gpfswork/rech/kpf/umg16uw/results_hf/sm/checkpoint-10 and are newly initialized: ['sequence_summary.summary.weight', 'sequence_summary.summary.bias']
    You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
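The warning itself names the problem: the checkpoint did not contain weights for the sequence_summary classification head, so Transformers initialized them randomly. One way to see exactly which weights were and were not loaded is the output_loading_info flag of from_pretrained; a minimal sketch, reusing the checkpoint path from the log and the num_labels=3 from the code below:

    from transformers import FlaubertForSequenceClassification

    # Ask from_pretrained to report loading details instead of only
    # printing the warning.
    model, loading_info = FlaubertForSequenceClassification.from_pretrained(
        "/gpfswork/rech/kpf/umg16uw/results_hf/sm/checkpoint-10",
        num_labels=3,
        output_loading_info=True,
    )

    # Keys present in the model but absent from the checkpoint
    # -> these were randomly initialized.
    print("missing keys:", loading_info["missing_keys"])
    # Keys present in the checkpoint but unused by this architecture.
    print("unexpected keys:", loading_info["unexpected_keys"])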

In fact, the model has already been trained on a huge dataset; I am loading it here only to run inference on a new dataset.

    # Define the checkpoint path before it is used by from_pretrained.
    modelForClass = '/g/checkpoint-10'
    test_file = '/g/012.xml'

    model = XXXForSequenceClassification.from_pretrained(modelForClass, num_labels=3)

    # Prepare the test set and run inference with the loaded model.
    test = preprare_data(PRE_TRAINED_MODEL_NAME, test_file)
    pred = predict(test, model)
    ***** Running Prediction *****
      Num examples = 5
      Batch size = 8
      0%|                                            | 0/1 [00:00<?, ?it/s]
    [[-0.0903191   0.18442413 -0.09337573]
     [-0.08772105  0.17791435 -0.10178708]
     [-0.0903393   0.18614864 -0.08101001]
     [-0.08786416  0.1888753  -0.08145989]
     [-0.06697702  0.1874733  -0.09423935]]
    100%|████████████████████████████████████████████| 1/1 [00:00<00:00,  9.89it/s]

    real    0m36.431s
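The 5x3 matrix in the output is raw logits for the five test examples. Note that every row peaks at the same class, which is consistent with a classification head that was freshly initialized rather than loaded from the checkpoint. A minimal sketch (assuming PyTorch, since the Trainer is being used) of turning those logits into probabilities and predicted labels:

    import torch
    import torch.nn.functional as F

    # Raw logits copied from the prediction log above: 5 examples x 3 classes.
    logits = torch.tensor([
        [-0.0903191,  0.18442413, -0.09337573],
        [-0.08772105, 0.17791435, -0.10178708],
        [-0.0903393,  0.18614864, -0.08101001],
        [-0.08786416, 0.1888753,  -0.08145989],
        [-0.06697702, 0.1874733,  -0.09423935],
    ])

    probs = F.softmax(logits, dim=-1)    # per-class probabilities
    preds = torch.argmax(probs, dim=-1)  # predicted label per example
    print(preds)  # tensor([1, 1, 1, 1, 1]) -> every example gets class 1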

小智 0

Not sure if this helps, but I ran into the same error when loading an existing model with HuggingFace's Transformers library. I fixed it by initializing from the correct framework (I was using TensorFlow when I should have been using PyTorch), after which the model loaded fine. The model I used had been trained with RoBERTa; I then switched it for a regular BERT model. I hope this helps, or at least points you in the right direction. Could you show the full code if possible?
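For reference, a minimal sketch of what that framework mix-up looks like in Transformers: each architecture has a PyTorch class and a TF-prefixed TensorFlow class, and from_pretrained can also convert across frameworks via the from_tf / from_pt flags. The checkpoint path below is hypothetical, and running both branches requires both frameworks installed:

    from transformers import (
        RobertaForSequenceClassification,    # PyTorch class
        TFRobertaForSequenceClassification,  # TensorFlow class
    )

    ckpt = "./my-finetuned-checkpoint"  # hypothetical path

    # If the checkpoint was saved with PyTorch (pytorch_model.bin),
    # load it with the PyTorch class:
    pt_model = RobertaForSequenceClassification.from_pretrained(ckpt)

    # If it was saved with TensorFlow (tf_model.h5), either use the TF class...
    tf_model = TFRobertaForSequenceClassification.from_pretrained(ckpt)

    # ...or convert it into a PyTorch model with from_tf=True:
    pt_from_tf = RobertaForSequenceClassification.from_pretrained(ckpt, from_tf=True)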