I want to pre-train BERT and RoBERTa MLMs on a domain corpus (sentiment-related text). How long would that take with 50k~100k words? Since RoBERTa is not trained with the next-sentence-prediction objective, it has one training objective fewer than BERT and uses larger mini-batches and learning rates, so I would expect RoBERTa to be much faster?
I'm just wondering whether the HuggingFace BertForSequenceClassification model can be extended to more than 2 labels. The documentation says we can pass positional arguments, but "labels" doesn't seem to work. Does anyone have an idea?
import torch as th
import transformers

labels = th.tensor([0,0,0,0,0,0], dtype=th.long).unsqueeze(0)
print(labels.shape)

# This call fails: from_pretrained() does not accept a `labels` keyword.
modelBERTClass = transformers.BertForSequenceClassification.from_pretrained(
    'bert-base-uncased',
    labels=labels
)

l = [module for module in modelBERTClass.modules()]
l
torch.Size([1, 6])
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-122-fea9a36402a6> in <module>()
3 modelBERTClass = transformers.BertForSequenceClassification.from_pretrained(
4 'bert-base-uncased',
----> 5 labels=labels
6 )
7
/usr/local/lib/python3.6/dist-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
653
654 # Instantiate model.
--> 655 model = cls(config, *model_args, **model_kwargs)
656
657 if state_dict is None and not from_tf:
TypeError: __init__() …

python transformer-model bert-language-model huggingface-transformers
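For what it's worth, a minimal sketch (assuming a recent transformers version) of how more than two labels are usually configured: the label count goes into the model config via num_labels, while label tensors are passed to the model's forward call, not to from_pretrained.

import torch
from transformers import BertForSequenceClassification, BertTokenizer

# num_labels sizes the classification head; it is a config argument, not data.
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=6)
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

inputs = tokenizer("an example sentence", return_tensors="pt")
labels = torch.tensor([3])  # one class index per example, in [0, num_labels)

# Labels are supplied at call time; the output then contains the classification loss.
outputs = model(**inputs, labels=labels)
print(outputs[0])  # loss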
I'm trying to use the Hugging Face Transformers library to POS-tag French. In English, I can do this with a sentence such as:
The weather is really great. So let us go for a walk.
The result is:
token feature
0 The DET
1 weather NOUN
2 is AUX
3 really ADV
4 great ADJ
5 . PUNCT
6 So ADV
7 let VERB
8 us PRON
9 go VERB
10 for ADP
11 a DET
12 walk NOUN
13 . PUNCT
Does anyone know how to achieve something similar for French?
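For reference, a sketch of one possible route, assuming a French token-classification checkpoint is available on the Hugging Face hub; the model name below is an assumption, so substitute any French POS checkpoint you find there.

from transformers import pipeline

# Hypothetical French POS checkpoint -- replace with any French
# token-classification model from the hub.
nlp_fr = pipeline(
    "token-classification",
    model="gilf/french-camembert-postag-model",
)

for tok in nlp_fr("Le temps est vraiment magnifique. Allons donc nous promener."):
    print(tok["word"], tok["entity"])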
Here is the code I used in a Jupyter notebook for the English version:
!git clone https://github.com/bhoov/spacyface.git
!python -m spacy download en_core_web_sm
from transformers import pipeline
import numpy as np
import pandas as pd
nlp = pipeline('feature-extraction') …

I fine-tuned BERT for sequence classification on a specific task, and I want to apply LIME explanations to see how each token contributes to the classification into a particular label, since LIME treats the classifier as a black box. I put together combined code based on code available online, as follows:
# coding=utf-8
# Copyright 2018 The Google AI Language Team Authors and The HugginFace Inc. team.
# Copyright (c) 2018, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on …

I'm currently implementing multi-head self-attention in PyTorch. I've looked at several implementations, and they seem somewhat wrong, or at least I'm not sure why they are done that way. They usually apply the linear projection only once:
self.query_projection = nn.Linear(input_dim, output_dim)
self.key_projection = nn.Linear(input_dim, output_dim)
self.value_projection = nn.Linear(input_dim, output_dim)
Then they often reshape the projections as:
query_heads = query_projected.view(batch_size, query_lenght, head_count, head_dimension).transpose(1,2)
key_heads = key_projected.view(batch_size, key_len, head_count, head_dimension).transpose(1, 2) # (batch_size, heads_count, key_len, d_head)
value_heads = value_projected.view(batch_size, value_len, head_count, head_dimension).transpose(1, 2) # (batch_size, heads_count, value_len, d_head)
attention_weights = scaled_dot_product(query_heads, key_heads)
According to this code, each head processes only a slice of the projected query. However, the original paper says we need a separate linear projection for each head in the encoder.
Is the implementation shown here correct?
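For intuition, here is a minimal sketch (with assumed dimensions, not taken from any particular implementation) showing that a single joint nn.Linear whose output is split into heads is equivalent to giving each head its own smaller projection built from a slice of the same weight matrix:

import torch
import torch.nn as nn

batch_size, seq_len = 2, 5
input_dim, head_count = 16, 4
head_dimension = input_dim // head_count

# A single joint projection for all heads (what the questioned implementations do).
query_projection = nn.Linear(input_dim, input_dim, bias=False)

x = torch.randn(batch_size, seq_len, input_dim)

# Split the joint output into per-head chunks, as in the snippet above.
joint = query_projection(x).view(batch_size, seq_len, head_count, head_dimension).transpose(1, 2)

# The same computation written as one separate projection per head, each taken
# from its own slice of the joint weight matrix.
separate = torch.stack(
    [x @ query_projection.weight[h * head_dimension:(h + 1) * head_dimension].t()
     for h in range(head_count)],
    dim=1)

print(torch.allclose(joint, separate))  # True: each head already has its own rows of W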
nlp neural-network attention-model pytorch bert-language-model
I'm trying to install bertopic[visualization] on my MacBook Pro with the following command:
pip3 install bertopic[visualization]
But whenever I run the above command, I get an error. The error is:
zsh: no matches found: bertopic[visualization]
Is there a way to install bertopic's visualization option?
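One likely explanation, assuming a default zsh setup: zsh treats the square brackets as a glob pattern and finds no matching file, so pip never runs. Quoting the requirement should avoid the expansion:

pip3 install 'bertopic[visualization]'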
So I've been trying to work through some BERT examples I found on GitHub; this is my first attempt at using BERT and seeing how it works. The notebook used is: https://github.com/prateekjoshi565/Fine-Tuning-BERT/blob/master/Fine_Tuning_BERT_for_Spam_Classification.ipynb
I used a different dataset, but I ran into the problem TypeError: linear(): argument 'input' (position 1) must be Tensor, not str. I honestly don't know what I did wrong. Can anyone help me?
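One guess, assuming the wrapper class from that notebook (a custom module around AutoModel) and transformers >= 4.0: the backbone then returns a dict-like ModelOutput, so tuple-unpacking it yields string keys, and one of those strings ends up inside a linear layer. A minimal self-contained sketch of the usual workaround:

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
bert = AutoModel.from_pretrained('bert-base-uncased')

batch = tokenizer(["an example message"], return_tensors="pt", padding=True)
sent_id, mask = batch['input_ids'], batch['attention_mask']

# With return_dict=True (the default in transformers >= 4.0), tuple-unpacking
# the output yields the *keys* ('last_hidden_state', 'pooler_output', ...),
# i.e. strings. Asking for a tuple restores the behaviour the notebook expects:
_, cls_hs = bert(sent_id, attention_mask=mask, return_dict=False)

fc1 = torch.nn.Linear(768, 512)
x = fc1(cls_hs)  # works: cls_hs is the pooled [CLS] tensor, not a str
print(x.shape)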
The code I've been using is:
# convert class weights to tensor
weights= torch.tensor(class_wts,dtype=torch.float)
weights = weights.to(device)
# loss function
cross_entropy = nn.NLLLoss(weight=weights)
# number of training epochs
epochs = 10
def train():
    model.train()
    total_loss, total_accuracy = 0, 0
    # empty list to save model predictions
    total_preds = []
    # iterate over batches
    for step, batch in enumerate(train_dataloader):
        # progress update after every 50 batches. …

I'm trying to convert albert's .pb model to tflite.
I made the .pb model with https://github.com/google-research/albert in tf 1.15.
I used
converter = tf.compat.v1.lite.TFLiteConverter.from_saved_model(saved_model_dir)  # path to the SavedModel directory
to make the tflite file (in tf 2.4.1),
but
Traceback (most recent call last):
File "convert.py", line 7, in <module>
tflite_model = converter.convert()
File "/home/pgb/anaconda3/envs/test2/lib/python3.6/site-packages/tensorflow_core/lite/python/lite.py", line 983, in convert
**converter_kwargs)
File "/home/pgb/anaconda3/envs/test2/lib/python3.6/site-packages/tensorflow_core/lite/python/convert.py", line 449, in toco_convert_impl
enable_mlir_converter=enable_mlir_converter)
File "/home/pgb/anaconda3/envs/test2/lib/python3.6/site-packages/tensorflow_core/lite/python/convert.py", line 200, in toco_convert_protos
raise ConverterError("See console for info.\n%s\n%s\n" % (stdout, stderr))
tensorflow.lite.python.convert.ConverterError: See console for info.
2021-04-25 17:30:33.543663: …

I want to use a pre-trained BERT model for a text classification task (I'm using the Huggingface library). However, the pre-trained model was trained on a different domain than mine, and I have a large unannotated dataset that could be used to fine-tune it. If I only use the labeled examples and fine-tune it "on the fly" while training on the specific task (BertForSequenceClassification), the dataset is too small to adapt the language model to the specific domain. What is the best way to do this? Thanks!
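A minimal sketch of the usual two-step approach (domain-adaptive MLM pre-training on the unannotated text, then task fine-tuning), assuming the Trainer API and a plain text file of in-domain sentences; the file name and hyperparameters are placeholders:

from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, LineByLineTextDataset,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Unannotated in-domain text, one sentence per line (placeholder path).
dataset = LineByLineTextDataset(tokenizer=tokenizer,
                                file_path="domain_corpus.txt",
                                block_size=128)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer,
                                           mlm=True, mlm_probability=0.15)

args = TrainingArguments(output_dir="bert-domain-adapted",
                         num_train_epochs=3,
                         per_device_train_batch_size=16)

Trainer(model=model, args=args,
        data_collator=collator, train_dataset=dataset).train()

# The adapted checkpoint in "bert-domain-adapted" can then be loaded with
# BertForSequenceClassification.from_pretrained(...) for the labeled task.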
nlp text-classification bert-language-model huggingface-transformers pytorch-lightning
I use this code:
def train():
    model.train()
    total_loss, total_accuracy = 0, 0
    # empty list to save model predictions
    total_preds = []
    # iterate over batches
    for step, batch in enumerate(train_dataloader):
        # progress update after every 50 batches.
        if step % 50 == 0 and not step == 0:
            print('  Batch {:>5,}  of  {:>5,}.'.format(step, len(train_dataloader)))
        # push the batch to gpu
        #batch = [r for r in batch]
        sent_id, mask, labels = batch['input_ids'], batch['attention_mask'], batch['labels']
        print(6)
        print(sent_id)
        print(mask)
        print(labels)
        print(batch['input_ids'].shape)
        print(batch['attention_mask'].shape)
        print(batch['labels'].shape)
        # clear previously calculated gradients …

python neural-network torch cross-entropy bert-language-model