I am trying to evaluate several Transformer models sequentially on the same dataset to check which one performs better.
The list of models is:
MODELS = [
    ('xlm-mlm-enfr-1024', "XLMModel"),
    ('distilbert-base-cased', "DistilBertModel"),
    ('bert-base-uncased', "BertModel"),
    ('roberta-base', "RobertaModel"),
    ("cardiffnlp/twitter-roberta-base-sentiment", "RobertaSentTW"),
    ('xlnet-base-cased', "XLNetModel"),
    #('ctrl', "CTRLModel"),
    ('transfo-xl-wt103', "TransfoXLModel"),
    ('bert-base-cased', "BertModelUncased"),
    ('xlm-roberta-base', "XLMRobertaModel"),
    ('openai-gpt', "OpenAIGPTModel"),
    ('gpt2', "GPT2Model"),
]
All of them work fine until the 'ctrl' model, which returns this error:
Asking to pad, but the tokenizer does not have a padding token. Please select a token to use as 'pad_token' '(tokenizer.pad_token = tokenizer.eos_token e.g.)' or add a new pad token via 'tokenizer.add_special_tokens({'pad_token': '[PAD]'})'.
This happens when tokenizing the sentences of my dataset.
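The two remedies named in the error message can be sketched as a small helper. This is a minimal sketch, not the asker's code: the name `ensure_pad_token` and the `"[PAD]"` default are assumptions made for illustration. It relies only on the `pad_token` and `eos_token` attributes, `add_special_tokens`, and `resize_token_embeddings`, which Hugging Face tokenizers and models expose.

```python
def ensure_pad_token(tokenizer, model=None, new_token="[PAD]"):
    """Give `tokenizer` a padding token if it lacks one.

    Returns "existing", "eos", or "added" to report what was done.
    `ensure_pad_token` and `new_token` are hypothetical names for this sketch.
    """
    if getattr(tokenizer, "pad_token", None) is not None:
        return "existing"
    if getattr(tokenizer, "eos_token", None) is not None:
        # Option 1 from the error message: reuse the end-of-sequence token.
        tokenizer.pad_token = tokenizer.eos_token
        return "eos"
    # Option 2 from the error message: register a brand-new padding token.
    tokenizer.add_special_tokens({"pad_token": new_token})
    if model is not None:
        # The vocabulary grew by one entry, so the embedding matrix must grow too.
        model.resize_token_embeddings(len(tokenizer))
    return "added"

# Hypothetical usage inside the evaluation loop (not executed here):
# tokenizer = AutoTokenizer.from_pretrained(pretrained_weights)
# model = AutoModel.from_pretrained(pretrained_weights)
# ensure_pad_token(tokenizer, model)
```

Calling the helper once per model, right after the tokenizer is loaded, would leave tokenizers that already define a padding token untouched.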
The tokenization code is:
SEQ_LEN = MAX_LEN #(50)
for pretrained_weights, model_name in MODELS:
    print("***************** INICIANDO " …

I trained a sequence classification model using transformers (BertForSequenceClassification), but I get this error:
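Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper__index_select)
I really don't understand where the problem is: whether it is in my model, in how I tokenize the data, or somewhere else.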
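The message says that one operation received tensors living on different devices. A common cause, assumed here because the training loop is not shown, is that the model sits on one device while the batch tensors sit on another. The sketch below uses a hypothetical helper, `move_to_device`, that moves every tensor in a batch to a chosen device. It relies only on the `.to(device)` method that PyTorch tensors provide, so it is written without importing torch.

```python
def move_to_device(obj, device):
    """Recursively move anything with a .to() method (e.g. torch tensors) to `device`.

    `move_to_device` is a hypothetical helper for this sketch, not a PyTorch API.
    Dicts, lists, and tuples are rebuilt with their contents moved; other
    values are returned unchanged.
    """
    if hasattr(obj, "to"):
        return obj.to(device)
    if isinstance(obj, dict):
        return {key: move_to_device(value, device) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(move_to_device(value, device) for value in obj)
    return obj

# Hypothetical usage with PyTorch (not executed here):
# device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# model.to(device)                      # model weights on `device`
# for batch in dataloader:
#     input_ids, masks, labels = move_to_device(batch, device)
#     outputs = model(input_ids, attention_mask=masks, labels=labels)
```

Note that `map_location` in `torch.load` only controls where the loaded state dict is placed; the model and each batch still have to be put on the same device before the forward pass.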
Here is my code:
Loading the pretrained model
import torch
from transformers import BertForSequenceClassification

model_state_dict = torch.load("../MODELOS/TRANSFORMERS/TransformersNormal", map_location='cpu')  # Doesn't work with map_location='cuda:0' either
model = BertForSequenceClassification.from_pretrained(pretrained_model_name_or_path="bert-base-uncased", state_dict=model_state_dict, cache_dir='./data')

Creating the DataLoader
def crearDataLoad(dfv, tokenizer):

    dft = dfv  # use the validation set so we get results without changing much code

    #validation = dfv['text']
    validation = dfv['text'].str.lower()  # for uncased models; the file we called test is the one fed to the neural network
    validation_labels = dfv['label']

    validation_inputs = crearinputs(validation, tokenizer)
    validation_masks = crearmask(validation_inputs)
    …

I have been using the Python module nudenet in my final degree project. I am running it on Google Colab.
It had been working fine for the past few months, until yesterday, when I tried to import it and got this error:
!pip install --upgrade nudenet
from nudenet import NudeClassifier
ImportError: cannot import name '_registerMatType' from 'cv2.cv2' (/usr/local/lib/python3.7/dist-packages/cv2/cv2.cpython-37m-x86_64-linux-gnu.so)
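This ImportError is often reported when more than one OpenCV wheel is installed at different versions; whether that applies to this Colab runtime is an assumption. A minimal sketch for checking it with the standard library is shown below. The helper names `installed_opencv_versions` and `versions_conflict` are hypothetical, and the package list covers the four OpenCV wheels published on PyPI.

```python
from importlib import metadata

# The OpenCV wheels published on PyPI; they all install the same `cv2` module.
OPENCV_PACKAGES = (
    "opencv-python",
    "opencv-python-headless",
    "opencv-contrib-python",
    "opencv-contrib-python-headless",
)

def installed_opencv_versions(names=OPENCV_PACKAGES):
    """Return {package name: version} for each listed package that is installed."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # this wheel is not installed
    return found

def versions_conflict(found):
    """True when the installed wheels do not all share one version."""
    return len(set(found.values())) > 1

if __name__ == "__main__":
    found = installed_opencv_versions()
    print(found)
    print("conflict:", versions_conflict(found))
```

If the script reports a conflict, pinning every installed OpenCV wheel to the same version is one thing to try.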
I tried to fix this error by downgrading opencv-python-headless to an earlier version:
!pip uninstall opencv-python-headless==4.5.5.62
!pip install opencv-python-headless==4.5.1.48
However, when I load the classifier, this error appears:
classifier = NudeClassifier()
Downloading the checkpoint to /root/.NudeNet/classifier_model.onnx
MB| |# | 0 Elapsed Time: 0:00:00
Content-length not found, file size cannot be estimated.
Succefully Downloaded to: /root/.NudeNet/classifier_model.onnx
InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from /root/.NudeNet/classifier_model.onnx failed:Protobuf parsing failed.
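The log reports a successful download but no content length, and ONNX Runtime then fails to parse the file. One possibility, stated here as an assumption, is that the saved file is not a real ONNX model (for example an empty file or an HTML error page). The sketch below inspects the file using only the standard library; the helper name `inspect_download` and the 1024-byte threshold are hypothetical choices for illustration.

```python
import os

def inspect_download(path, min_bytes=1024):
    """Return a rough diagnosis of a downloaded model file.

    Possible results: "missing", "text response", "too small", "looks binary".
    `min_bytes` is an arbitrary threshold chosen for this sketch.
    """
    if not os.path.exists(path):
        return "missing"
    with open(path, "rb") as fh:
        head = fh.read(256).lstrip().lower()
    # HTML, XML, or JSON at the start suggests an error page, not a model.
    if head.startswith((b"<!doctype", b"<html", b"<?xml", b"{")):
        return "text response"
    if os.path.getsize(path) < min_bytes:
        return "too small"
    return "looks binary"

if __name__ == "__main__":
    # Path taken from the log above.
    print(inspect_download("/root/.NudeNet/classifier_model.onnx"))
```

A result other than "looks binary" would point to the download itself rather than to OpenCV or ONNX Runtime.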
I also tried downgrading the nudenet module, but it still doesn't work.
Thanks in advance.