Unable to load the SpanBERT model with the transformers package

Question


Big*_*igD 3 python bert-language-model huggingface-transformers

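I have some questions about loading SpanBert with the transformers package.

I downloaded the pre-trained files from the SpanBert GitHub repo and vocab.txt from Bert. Here is the code I used for loading: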

model = BertModel.from_pretrained(config_file=config_file,
                                  pretrained_model_name_or_path=model_file,
                                  vocab_file=vocab_file)
model.to("cuda")


where

config_file -> config.json
model_file -> pytorch_model.bin
vocab_file -> vocab.txt

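But with the above code I get a UnicodeDecodeError saying 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

I also tried loading SpanBert using the method mentioned here. But it returned OSError: file SpanBERT/spanbert-base-cased not found.

Do you have any suggestions for loading the pre-trained model correctly? Any suggestions are much appreciated. Thanks!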

Answer 1

Zab*_*azi 6

Download the pre-trained weights from the GitHub page.

https://github.com/facebookresearch/SpanBERT

SpanBERT (base & cased): 12-layer, 768-hidden, 12-heads, 110M parameters

SpanBERT (large & cased): 24-layer, 1024-hidden, 16-heads, 340M parameters

Extract them to a folder. For example, I extracted them to a folder named spanbert_hf_base, which contains a .bin file and a config.json file.
You can load the model with AutoModel and use a plain BERT tokenizer. From their repo:

These models have the same format as the HuggingFace BERT models, so you can easily replace them with our SpanBERT models.
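Since the loader is pointed at a folder rather than at individual files, it can help to confirm that the folder really holds both files before loading. The following is only a minimal sketch: the helper name missing_model_files is hypothetical, and the file names config.json and pytorch_model.bin are taken from the question, so adjust them if your download differs.

```python
import tempfile
from pathlib import Path

# File names as listed in the question (assumption: your download uses the same names).
REQUIRED_FILES = ("config.json", "pytorch_model.bin")


def missing_model_files(model_dir):
    """Return the required file names that are absent from model_dir."""
    folder = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]


# Demo on a throwaway folder that stands in for spanbert_hf_base/.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "config.json").write_text("{}", encoding="utf-8")
    before = missing_model_files(tmp)  # weights file not created yet
    (Path(tmp) / "pytorch_model.bin").write_bytes(b"")
    after = missing_model_files(tmp)   # both files now present

print(before)
print(after)
```

In practice you would call missing_model_files('spanbert_hf_base/') and only go on to load the model when the returned list is empty.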

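import torch
from transformers import AutoModel, BertTokenizer

# Path to the folder that holds the .bin file and config.json
model = AutoModel.from_pretrained('spanbert_hf_base/')

# SpanBERT is a cased model, so 'bert-base-cased' is likely the better-matching vocabulary
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

# Encode one sentence and add a batch dimension: shape (1, sequence_length)
b = torch.tensor(tokenizer.encode('hi this is me, mr. meeseeks', add_special_tokens=True, max_length = 512)).unsqueeze(0)

# out[0] holds the per-token hidden states, out[1] the pooled output
out = model(b)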


Out:

(tensor([[[-0.1204, -0.0806, -0.0168,  ..., -0.0599, -0.1932, -0.0967],
          [-0.0851, -0.0980,  0.0039,  ..., -0.0563, -0.1655, -0.0156],
          [-0.1111, -0.0318,  0.0141,  ..., -0.0518, -0.1068, -0.1271],
          [-0.0317, -0.0441, -0.0306,  ..., -0.1049, -0.1940, -0.1919],
          [-0.1200,  0.0277, -0.0372,  ..., -0.0930, -0.0627,  0.0143],
          [-0.1204, -0.0806, -0.0168,  ..., -0.0599, -0.1932, -0.0967]]],
        grad_fn=<NativeLayerNormBackward>),
 tensor([[-9.7530e-02,  1.6328e-01,  9.3202e-03,  1.1010e-01,  7.3047e-02,
          -1.7635e-01,  1.0046e-01, -1.4826e-02,  9.2583e-
         ............

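As for the UnicodeDecodeError in the question, one plausible reading is that the binary weights file was passed where a text file (such as the JSON config) was expected, so it got decoded as UTF-8. The sketch below does not use transformers at all. It assumes that the weights file is a binary pickle, whose first byte is 0x80 for pickle protocol 2 and later, and reproduces the same message using a small stand-in file. The helper name try_read_as_json is hypothetical.

```python
import json
import pickle
import tempfile
from pathlib import Path


def try_read_as_json(path):
    """Read a file as UTF-8 JSON; return the decode error text, or None on success."""
    try:
        with open(path, encoding="utf-8") as handle:
            json.load(handle)
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for pytorch_model.bin: a pickle written with protocol 2 starts with byte 0x80.
    fake_weights = Path(tmp) / "pytorch_model.bin"
    fake_weights.write_bytes(pickle.dumps({"weight": [0.1, 0.2]}, protocol=2))
    error_text = try_read_as_json(fake_weights)

print(error_text)
```

If this reading is right, pointing from_pretrained at the folder, as in the answer above, avoids the problem because the library then picks up config.json and the weights file by name.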

Archived:	5 years, 6 months ago
Views:	889
Last activity:	5 years, 1 month ago