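How long does it take to pre-train a BERT/RoBERTa language model on domain text, and which is faster?

I want to pre-train BERT and RoBERTa with the MLM objective on a domain corpus (sentiment-related text). How long would it take with 50k~100k words? RoBERTa is not trained on the next-sentence-prediction objective, so it has one fewer training objective than BERT, and it uses larger mini-batches and learning rates. Does that mean RoBERTa will be much faster?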

language-model bert-language-model huggingface-transformers

Cas*_*hao

2020 03-26

2
votes

1
answer

1028
views

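Hugging Face BertForSequenceClassification with 6 labels instead of 2

I was just wondering whether it is possible to extend the HuggingFace BertForSequenceClassification model to more than 2 labels. The docs say we can pass positional arguments, but it seems "labels" does not work. Does anybody have an idea?

Model assignment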

labels = th.tensor([0,0,0,0,0,0], dtype=th.long).unsqueeze(0)
print(labels.shape)
modelBERTClass = transformers.BertForSequenceClassification.from_pretrained(
    'bert-base-uncased', 
    labels=labels
    )

l = [module for module in modelBERTClass.modules()]
l

Console output

torch.Size([1, 6])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-122-fea9a36402a6> in <module>()
      3 modelBERTClass = transformers.BertForSequenceClassification.from_pretrained(
      4     'bert-base-uncased',
----> 5     labels=labels
      6     )
      7 

/usr/local/lib/python3.6/dist-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    653 
    654         # Instantiate model.
--> 655         model = cls(config, *model_args, **model_kwargs)
    656 
    657         if state_dict is None and not from_tf:

TypeError: __init__() …
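
A commonly suggested way around this error is to give the number of classes as `num_labels`, and to pass the `labels` tensor only in the forward call. The sketch below is a minimal illustration, assuming a tiny randomly initialised `BertConfig` (hypothetical sizes, chosen so that nothing is downloaded). With the real checkpoint the equivalent call would be `from_pretrained('bert-base-uncased', num_labels=6)`.

```python
import torch
from transformers import BertConfig, BertForSequenceClassification

# Tiny, randomly initialised config (hypothetical sizes) so the sketch runs offline.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=6,  # the class count goes into the config, not a `labels` tensor
)
model = BertForSequenceClassification(config)
model.eval()

input_ids = torch.randint(1, 100, (4, 10))  # 4 fake sentences of 10 token ids each
labels = torch.tensor([0, 5, 2, 3])          # one class index (0..5) per sentence

with torch.no_grad():
    outputs = model(input_ids=input_ids, labels=labels)

# With labels supplied, the first two outputs are the loss and the logits.
loss, logits = outputs[0], outputs[1]
print(model.classifier.out_features, tuple(logits.shape))
```

Note that the labels here have shape `(batch_size,)`, one class index per example, rather than the `(1, 6)` tensor built in the question.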

python transformer-model bert-language-model huggingface-transformers

Ale*_*lex

lucky-day

2
votes

1
answer

6525
views

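How can I POS-tag French text using the Hugging Face Transformers library?

I am trying to POS-tag French using the Hugging Face Transformers library. In English I was able to do so, given a sentence such as:

The weather is really great. So let us go for a walk.

The result is: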

    token   feature
0   The     DET
1   weather NOUN
2   is      AUX
3   really  ADV
4   great   ADJ
5   .       PUNCT
6   So      ADV
7   let     VERB
8   us      PRON
9   go      VERB
10  for     ADP
11  a       DET
12  walk    NOUN
13  .       PUNCT

Does anyone know how a similar result could be achieved for French?

This is the code I used for the English version in a Jupyter notebook:

!git clone https://github.com/bhoov/spacyface.git
!python -m spacy download en_core_web_sm

from transformers import pipeline
import numpy as np
import pandas as pd

nlp = pipeline('feature-extraction') …

python nlp bert-language-model huggingface-transformers

gil*_*des

2020 07-08

2
votes

1
answer

4398
views

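Applying LIME interpretation to my fine-tuned BERT for sequence classification model?

I fine-tuned BERT for sequence classification on a specific task, and I want to apply LIME interpretation to see how each token contributes to classification into a specific label, since LIME treats the classifier as a black box. I put together the following code from code available online: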

# coding=utf-8
# Copyright 2018 The Google AI Language Team Authors and The HugginFace Inc. team.
# Copyright (c) 2018, NVIDIA CORPORATION.  All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on …
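
Because LIME only sees a black box, what it needs from the model is a function that maps a list of raw strings to an array of class probabilities with shape `(n_samples, n_classes)`. The sketch below shows that wrapper shape only. `toy_encode` and `toy_model` are hypothetical stand-ins for the fine-tuned tokenizer and BERT classifier, not the code from the question.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-ins (hypothetical) for the fine-tuned tokenizer and classifier.
VOCAB = {"<unk>": 0, "good": 1, "bad": 2, "movie": 3}
MAX_LEN = 6

def toy_encode(text):
    """Whitespace-tokenise, map to ids, then truncate/pad to MAX_LEN."""
    ids = [VOCAB.get(tok, 0) for tok in text.lower().split()][:MAX_LEN]
    return ids + [0] * (MAX_LEN - len(ids))

toy_model = nn.Sequential(
    nn.Embedding(len(VOCAB), 8),
    nn.Flatten(),
    nn.Linear(MAX_LEN * 8, 2),
)
toy_model.eval()

def predict_proba(texts):
    """Map a list of raw strings to an (n_samples, n_classes) probability array."""
    input_ids = torch.tensor([toy_encode(t) for t in texts])
    with torch.no_grad():
        logits = toy_model(input_ids)
    return torch.softmax(logits, dim=1).numpy()

# LIME perturbs the text by dropping words, so short or empty strings must work too.
probs = predict_proba(["good movie", "bad movie", ""])
print(probs.shape)
```

With the `lime` package installed, a function of this shape is what would be passed as the classifier function to `LimeTextExplainer.explain_instance`.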

python lime bert-language-model

Eli*_*iam

2020 10-23

2
votes

1
answer

4046
views

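Multi-head attention: correct implementation of the linear transformations of Q, K, V

I am implementing multi-head self-attention in PyTorch right now. I looked at a couple of implementations and they seem a bit wrong, or at least I am not sure why they are done the way they are. They often apply the linear projection just once: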

    self.query_projection = nn.Linear(input_dim, output_dim)
    self.key_projection = nn.Linear(input_dim, output_dim)
    self.value_projection = nn.Linear(input_dim, output_dim)

and then they often reshape the projection as

    query_heads = query_projected.view(batch_size, query_len, head_count, head_dimension).transpose(1, 2)  # (batch_size, heads_count, query_len, d_head)
    key_heads = key_projected.view(batch_size, key_len, head_count, head_dimension).transpose(1, 2)  # (batch_size, heads_count, key_len, d_head)
    value_heads = value_projected.view(batch_size, value_len, head_count, head_dimension).transpose(1, 2)  # (batch_size, heads_count, value_len, d_head)

    attention_weights = scaled_dot_product(query_heads, key_heads)

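According to this code, each head works on a slice of the projected query. However, the original paper says that we need a different linear projection for each head in the encoder.

Is the implementation shown here correct?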
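
One way to check this is numerically: a single `nn.Linear(d_model, d_model)` followed by the reshape into heads gives the same result as `head_count` separate projections whose weights are row-blocks of the one large weight matrix. The sketch below is an illustration under assumed sizes (the dimensions and variable names are not from the implementations mentioned above).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch_size, seq_len, d_model, head_count = 2, 5, 8, 4
d_head = d_model // head_count

x = torch.randn(batch_size, seq_len, d_model)
query_projection = nn.Linear(d_model, d_model)

# Fused variant: project once, then split the last dimension into heads.
fused = (
    query_projection(x)
    .view(batch_size, seq_len, head_count, d_head)
    .transpose(1, 2)
)  # (batch_size, head_count, seq_len, d_head)

# Per-head variant: each head applies its own (d_head x d_model) projection,
# taken here as a row-block of the fused weight matrix and bias.
per_head = []
for h in range(head_count):
    rows = slice(h * d_head, (h + 1) * d_head)
    w_h = query_projection.weight[rows]  # (d_head, d_model)
    b_h = query_projection.bias[rows]    # (d_head,)
    per_head.append(x @ w_h.T + b_h)     # (batch_size, seq_len, d_head)
separate = torch.stack(per_head, dim=1)  # (batch_size, head_count, seq_len, d_head)

heads_match = torch.allclose(fused, separate, atol=1e-5)
print(heads_match)
```

Each head's slice of the output depends only on its own block of rows in the weight matrix, so the fused layer already holds a distinct learned projection per head.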

nlp neural-network attention-model pytorch bert-language-model

Ger*_*ens

lucky-day

2
votes

1
answer

481
views

zsh: no matches found: bertopic[visualization]

I am trying to install bertopic[visualization] on my MacBook Pro using the following command

pip3 install bertopic[visualization]

But whenever I run the above command, I get an error. The error is as follows:

zsh: no matches found: bertopic[visualization]

Is there a way to install the visualization option of bertopic?
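
zsh treats square brackets as a filename glob pattern and aborts when nothing matches, so the usual workaround is to quote the requirement. The sketch below builds the quoted command with the standard library. It also shows the argument-list form, which bypasses the shell entirely (built here but not executed).

```python
import shlex
import sys

requirement = "bertopic[visualization]"

# Quoted form that is safe to paste into zsh (and other POSIX-style shells).
shell_command = "pip3 install " + shlex.quote(requirement)
print(shell_command)

# Argument-list form for subprocess.run: no shell is involved, so the brackets
# are never interpreted as a glob pattern. Not executed in this sketch.
pip_argv = [sys.executable, "-m", "pip", "install", requirement]
```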

pip python-3.x bert-language-model

Nay*_*dhu

2021 01-08

2
votes

1
answer

1774
views

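TypeError: linear(): argument 'input' (position 1) must be Tensor, not str

I have been trying to work through some BERT examples I found on GitHub. This is the first time I am trying out BERT to see how it works. The BERT example I am using is the following: https://github.com/prateekjoshi565/Fine-Tuning-BERT/blob/master/Fine_Tuning_BERT_for_Spam_Classification.ipynb

I used a different dataset, but I am running into the error "TypeError: linear(): argument 'input' (position 1) must be Tensor, not str", and honestly I have no idea what I am doing wrong. Can anyone help me?

The code I have been using is the following: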

# convert class weights to tensor
weights= torch.tensor(class_wts,dtype=torch.float)
weights = weights.to(device)

# loss function
cross_entropy  = nn.NLLLoss(weight=weights) 

# number of training epochs
epochs = 10

def train():
  
  model.train()

  total_loss, total_accuracy = 0, 0
  
  # empty list to save model predictions
  total_preds=[]
  
  # iterate over batches
  for step,batch in enumerate(train_dataloader):
    
    # progress update after every 50 batches. …
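
One way this exact message can arise is offered here as an assumption, since the model definition is not shown above. If the forward pass tuple-unpacks a dict-like model output, the unpacked values are the string keys, not tensors. The sketch below reproduces that with a plain dict standing in for the model output (the key names are illustrative).

```python
import torch
import torch.nn as nn

fc = nn.Linear(8, 2)

# Plain dict standing in for a dict-like model output (an assumption, not the
# notebook's actual code); the key names are illustrative.
outputs = {
    "last_hidden_state": torch.randn(4, 10, 8),
    "pooler_output": torch.randn(4, 8),
}

# Tuple-unpacking a dict iterates over its keys, so cls_hs is a str here.
_, cls_hs = outputs

try:
    fc(cls_hs)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Indexing by key returns the tensor, which the linear layer accepts.
logits = fc(outputs["pooler_output"])
```

If the model is a Hugging Face one that returns such an output object, indexing it by name or calling the model with `return_dict=False` are the usual ways to get tensors back.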

python pytorch bert-language-model

and*_*rew

lucky-day

2
votes

1
answer

2077
views

tflite converter error: operation not supported

I was trying to convert a .pb model of ALBERT to tflite

I made the .pb model using https://github.com/google-research/albert in tf 1.15

I used converter = tf.compat.v1.lite.TFLiteConverter.from_saved_model(saved_model_dir) # path to the SavedModel directory to make the tflite file (in tf 2.4.1)

But I get:

Traceback (most recent call last):
  File "convert.py", line 7, in <module>
    tflite_model = converter.convert()
  File "/home/pgb/anaconda3/envs/test2/lib/python3.6/site-packages/tensorflow_core/lite/python/lite.py", line 983, in convert
    **converter_kwargs)
  File "/home/pgb/anaconda3/envs/test2/lib/python3.6/site-packages/tensorflow_core/lite/python/convert.py", line 449, in toco_convert_impl
    enable_mlir_converter=enable_mlir_converter)
  File "/home/pgb/anaconda3/envs/test2/lib/python3.6/site-packages/tensorflow_core/lite/python/convert.py", line 200, in toco_convert_protos
    raise ConverterError("See console for info.\n%s\n%s\n" % (stdout, stderr))
tensorflow.lite.python.convert.ConverterError: See console for info.
2021-04-25 17:30:33.543663: …

adb tensorflow tensorflow-lite bert-language-model

Mid*_*ang

lucky-day

2
votes

1
answer

7032
views

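What is the simplest way to continue training a pre-trained BERT model on a specific domain?

I want to use a pre-trained BERT model for a text classification task (I am using the Huggingface library). However, the pre-trained model was trained on domains different from mine, and I have a large unannotated dataset that could be used to fine-tune it. If I use only my labeled examples and fine-tune it "on the go" while training on the specific task (BertForSequenceClassification), the dataset is too small to adapt the language model to the specific domain. What is the best way to do this? Thanks!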
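
One common approach is to continue masked-language-model training on the unannotated text first, then load the resulting checkpoint into `BertForSequenceClassification`. The sketch below is a minimal illustration, not a full recipe. It assumes a tiny randomly initialised config, random token ids in place of real text, a hypothetical `MASK_ID`, and a simplified masking rule (every selected token becomes the mask token) so that it runs offline.

```python
import tempfile

import torch
from transformers import BertConfig, BertForMaskedLM, BertForSequenceClassification

torch.manual_seed(0)

# Tiny random config (hypothetical sizes); in practice you would start from a
# real checkpoint such as 'bert-base-uncased' and tokenised domain text.
config = BertConfig(vocab_size=120, hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64,
                    max_position_embeddings=32)
mlm_model = BertForMaskedLM(config)
mlm_model.train()

MASK_ID = 4  # hypothetical id of the [MASK] token in this toy vocabulary

# Step 1: one MLM update on "unlabeled" data (random ids stand in for text).
input_ids = torch.randint(5, 120, (8, 16))
mask = torch.rand(input_ids.shape) < 0.15  # select roughly 15% of positions
mask[0, 0] = True                          # guarantee at least one masked token
labels = input_ids.clone()
labels[~mask] = -100                       # unmasked positions are ignored by the loss
masked_inputs = input_ids.clone()
masked_inputs[mask] = MASK_ID              # simplified: always substitute [MASK]

optimizer = torch.optim.AdamW(mlm_model.parameters(), lr=5e-5)
loss = mlm_model(input_ids=masked_inputs, labels=labels)[0]
loss.backward()
optimizer.step()
mlm_loss = loss.item()

# Step 2: save the adapted encoder and reload it with a classification head.
with tempfile.TemporaryDirectory() as checkpoint_dir:
    mlm_model.save_pretrained(checkpoint_dir)
    classifier = BertForSequenceClassification.from_pretrained(
        checkpoint_dir, num_labels=2
    )
print(mlm_loss, classifier.classifier.out_features)
```

The classification head is freshly initialised on reload, so the labeled examples are still needed for the final fine-tuning step. With real text, `DataCollatorForLanguageModeling` from the same library can take care of the masking.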

nlp text-classification bert-language-model huggingface-transformers pytorch-lightning

Ori*_*rit

lucky-day

2
votes

1
answer

1959
views

Using nn.CrossEntropyLoss between outputs and target labels

I use this code

Function to train the model

def train():
  
  model.train()

  total_loss, total_accuracy = 0, 0
  
  # empty list to save model predictions
  total_preds=[]
  
  # iterate over batches
  for step,batch in enumerate(train_dataloader):
    
    # progress update after every 50 batches.
    if step % 50 == 0 and not step == 0:
      print('  Batch {:>5,}  of  {:>5,}.'.format(step, len(train_dataloader)))

    # push the batch to gpu
    #batch = [r for r in batch]
 
    sent_id, mask, labels = batch['input_ids'],batch['attention_mask'],batch['labels']
    print(6)
    print(sent_id)
    print(mask)
    print(labels)
    print(batch['input_ids'].shape)
    print(batch['attention_mask'].shape)
    print(batch['labels'].shape)

    # clear previously calculated gradients …
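
As a point of reference for the loss itself, `nn.CrossEntropyLoss` expects raw logits of shape `(batch_size, num_classes)` and integer class indices of shape `(batch_size,)`. It is equivalent to `nn.NLLLoss` applied to log-softmax outputs. The sketch below uses made-up logits and labels, not the batch from the training loop above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

logits = torch.randn(4, 3)            # (batch_size, num_classes), unnormalised scores
labels = torch.tensor([0, 2, 1, 2])   # (batch_size,), dtype long, class indices

# CrossEntropyLoss applies log-softmax internally, so it takes raw logits.
ce_loss = nn.CrossEntropyLoss()(logits, labels)

# NLLLoss expects log-probabilities, so log-softmax must be applied first.
nll_loss = nn.NLLLoss()(torch.log_softmax(logits, dim=1), labels)

losses_match = torch.allclose(ce_loss, nll_loss)
print(losses_match)
```

So a model whose last layer is already a log-softmax pairs with `nn.NLLLoss`, while a model that returns raw logits pairs with `nn.CrossEntropyLoss`.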

python neural-network torch cross-entropy bert-language-model

Sho*_*del

lucky-day

2
votes

1
answer

5877
views