Posts by Sal*_*can

Cannot load from AutoTokenizer.from_pretrained - TypeError: duplicate file name (sentencepiece_model.proto)

I am trying to load the tokenizer and the seq2seq model from a pretrained model.

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("ozcangundes/mt5-small-turkish-summarization")

model = AutoModelForSeq2SeqLM.from_pretrained("ozcangundes/mt5-small-turkish-summarization")

But I get this error:

File ~/.local/lib/python3.8/site-packages/google/protobuf/descriptor.py:1028, in FileDescriptor.__new__(cls, name, package, options, serialized_options, serialized_pb, dependencies, public_dependencies, syntax, pool, create_key)
   1026     raise RuntimeError('Please link in cpp generated lib for %s' % (name))
   1027 elif serialized_pb:
-> 1028   return _message.default_pool.AddSerializedFile(serialized_pb)
   1029 else:
   1030   return super(FileDescriptor, cls).__new__(cls)

    TypeError: Couldn't build proto file into descriptor pool: duplicate file name (sentencepiece_model.proto)

I tried both upgrading and downgrading the protobuf version, but neither fixed the error.

python nlp protocol-buffers huggingface

4 votes
1 answer
2896 views

"for" loop gradually slowing down

I wrote some code that displays the progress of a loop. Part of the code:

String instantBinary = "";
for (int i = 0; i < Text.length(); i++) {
    //Sometimes the text is too long                        

    if (Text.length() > 100) {
        if (Text.length() % (Text.length() / 100) == i % (Text.length() / 100)) {
            WTProgress = "Translate Progress.. %" + (i * 100 / Text.length());
            System.out.println(WTProgress); 
        }
    }

    switch("" + Text.charAt(i)) {
        case "1": 
            instantBinary += "0000000";
            break;
        case "2": 
            instantBinary += "0000001";
            break;
        case "3": 
            instantBinary += "0000010";
            break;
        case "4": 
            instantBinary += "0000011"; …
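
One likely contributor to this kind of gradual slowdown, assuming the truncated part of the loop follows the same pattern, is that each `instantBinary += ...` copies the whole string built so far, so the work per iteration grows with the length of the result. The following is a minimal sketch of the same translation using a `StringBuilder`. The names `BinaryTranslator`, `translate`, and `CODES` are hypothetical, and the table holds only the four mappings visible in the excerpt above; characters outside that table are skipped here, which may differ from the original code.

```java
import java.util.Map;

class BinaryTranslator {
    // Hypothetical lookup table: only the four mappings shown in the question.
    private static final Map<Character, String> CODES = Map.of(
        '1', "0000000",
        '2', "0000001",
        '3', "0000010",
        '4', "0000011"
    );

    // Appends each 7-bit code to a StringBuilder instead of using String +=,
    // so earlier output is not recopied on every iteration.
    static String translate(String text) {
        StringBuilder binary = new StringBuilder(text.length() * 7);
        for (int i = 0; i < text.length(); i++) {
            String code = CODES.get(text.charAt(i));
            if (code != null) { // assumption: unknown characters are ignored
                binary.append(code);
            }
        }
        return binary.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate("1234"));
    }
}
```

If progress output is still wanted, printing it only when the integer percentage changes keeps the number of `println` calls at about 100 regardless of input length.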

java performance for-loop

1 vote
2 answers
461 views