小编fer*_*567的帖子

AutoTokenizer.from_pretrained 无法加载本地保存的预训练分词器 (PyTorch)

我是 PyTorch 的新手,最近我一直在尝试使用 Transformers。我正在使用 HuggingFace 提供的预训练分词器。
我成功下载并运行它们。但如果我尝试保存它们并再次加载,则会发生一些错误。
如果我用来 AutoTokenizer.from_pretrained下载分词器,那么它就可以工作。

[1]:    tokenizer = AutoTokenizer.from_pretrained('distilroberta-base')
        text = "Hello there"
        enc = tokenizer.encode_plus(text)
        enc.keys()

Out[1]: dict_keys(['input_ids', 'attention_mask'])
Run Code Online (Sandbox Code Playgroud)

但是,如果我使用保存它tokenizer.save_pretrained("distilroberta-tokenizer")并尝试在本地加载它,则会失败。

[2]:    tmp = AutoTokenizer.from_pretrained('distilroberta-tokenizer')


---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/transformers/configuration_utils.py in get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
    238                 resume_download=resume_download,
--> 239                 local_files_only=local_files_only,
    240             )

/opt/conda/lib/python3.7/site-packages/transformers/file_utils.py in cached_path(url_or_filename, cache_dir, force_download, proxies, resume_download, user_agent, extract_compressed_file, force_extract, local_files_only)
    266         # File, but it doesn't exist.
--> 267         raise EnvironmentError("file {} not found".format(url_or_filename))
    268     else:

OSError: file …
Run Code Online (Sandbox Code Playgroud)

python deep-learning pytorch huggingface-transformers huggingface-tokenizers

7
推荐指数
2
解决办法
3万
查看次数