Tag: huggingface-transformers

PyTorch cross-entropy input dimensions

I am trying to build a binary classifier using Huggingface's BertModel and PyTorch. The classifier module looks like this:


class SSTClassifierModel(nn.Module):

  def __init__(self, num_classes = 2, hidden_size = 768):
    super(SSTClassifierModel, self).__init__()
    self.number_of_classes = num_classes
    self.dropout = nn.Dropout(0.01)
    self.hidden_size = hidden_size
    self.bert = BertModel.from_pretrained('bert-base-uncased')
    self.classifier = nn.Linear(hidden_size, num_classes)

  def forward(self, input_ids, att_masks,token_type_ids,  labels):
    _, embedding = self.bert(input_ids, token_type_ids, att_masks)
    output = self.classifier(self.dropout(embedding))
    return output


This is how I train the model:


loss_function = BCELoss()
model.train()
for epoch in range(NO_OF_EPOCHS):
    for step, batch in enumerate(train_dataloader):
        input_ids = batch[0].to(device)
        input_mask = batch[1].to(device)
        token_type_ids = batch[2].to(device)
        labels = batch[3].to(device)
        # assuming …

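The question is cut off, so the exact error is not known. One likely source of a dimension problem is that the classifier emits two values per example (num_classes = 2) while a binary cross-entropy loss expects one probability per target value, with input and target of the same shape. The plain-Python sketch below (no PyTorch; the helper names are made up for illustration) contrasts the two conventions: a multi-class cross-entropy takes logits of shape (batch, num_classes) with integer class labels of shape (batch,), while a binary cross-entropy takes one probability and one float target per example.

```python
import math

def softmax(logits):
    """Turn one row of logits into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(batch_logits, class_indices):
    """Mean cross-entropy: logits are (batch, num_classes), labels are (batch,) ints."""
    losses = []
    for logits, label in zip(batch_logits, class_indices):
        probs = softmax(logits)
        losses.append(-math.log(probs[label]))
    return sum(losses) / len(losses)

def binary_cross_entropy(batch_probs, targets):
    """Mean BCE: probabilities are (batch,), float targets are (batch,)."""
    losses = []
    for p, t in zip(batch_probs, targets):
        losses.append(-(t * math.log(p) + (1.0 - t) * math.log(1.0 - p)))
    return sum(losses) / len(losses)

# Two examples, two logits each: the shape a 2-unit classifier head produces.
logits = [[2.0, 0.0], [0.0, 2.0]]
labels = [0, 1]
ce = cross_entropy(logits, labels)

# To use a binary loss instead, reduce each row to a single probability of class 1
# and pass float targets of the same shape.
probs_class1 = [softmax(row)[1] for row in logits]
bce = binary_cross_entropy(probs_class1, [float(y) for y in labels])
```

For two classes the two losses agree numerically, so the choice is mostly about matching shapes: either keep two outputs with a multi-class loss and integer labels, or use a single output with a binary loss and float labels.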

python cross-entropy pytorch python-3.7 huggingface-transformers

P.A*_*oor


Score: 0 · Answers: 1 · Views: 1843

Why do we need the init_weights function in the pretrained BERT models in Huggingface Transformers?

In the Huggingface Transformers code, many of the fine-tuning models call init_weights. For example (here), init_weights is called at the end of the constructor.

class BertForSequenceClassification(BertPreTrainedModel):
    def __init__(self, config):
        super().__init__(config)
        self.num_labels = config.num_labels

        self.bert = BertModel(config)
        self.dropout = nn.Dropout(config.hidden_dropout_prob)
        self.classifier = nn.Linear(config.hidden_size, config.num_labels)

        self.init_weights()


As far as I can tell, it calls the following code:

def _init_weights(self, module):
    """ Initialize the weights """
    if isinstance(module, (nn.Linear, nn.Embedding)):
        # Slightly different from the TF version which uses truncated_normal for initialization
        # cf https://github.com/pytorch/pytorch/pull/5617
        module.weight.data.normal_(mean=0.0, std=self.config.initializer_range)
    elif isinstance(module, BertLayerNorm):
        module.bias.data.zero_()
        module.weight.data.fill_(1.0)
    if isinstance(module, nn.Linear) and module.bias is not None:
        module.bias.data.zero_()


My question is: if we are loading a pretrained model, why do we need to initialize the weights of every module?

I think I must be misunderstanding something here.
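One way to think about it, as a toy sketch rather than the library's actual code: the constructor always builds a randomly initialized model, and loading a checkpoint afterwards overwrites only the parameters whose names appear in that checkpoint. Anything the checkpoint lacks, such as a freshly added classifier head, keeps its random initialization. The sketch assumes this construct-then-load order; ToyModel, load_checkpoint, and the parameter names are hypothetical.

```python
import random

class ToyModel:
    """Hypothetical stand-in for a model whose constructor initializes every weight."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # Plays the role of init_weights(): every parameter starts out random.
        self.params = {
            "bert.encoder.weight": [rng.gauss(0.0, 0.02) for _ in range(4)],
            "classifier.weight": [rng.gauss(0.0, 0.02) for _ in range(4)],
        }

    def load_checkpoint(self, checkpoint):
        """Overwrite parameters found in the checkpoint; return the names that were not."""
        missing = []
        for name in self.params:
            if name in checkpoint:
                self.params[name] = list(checkpoint[name])
            else:
                missing.append(name)
        return missing

# A checkpoint that contains the encoder but no classifier head.
checkpoint = {"bert.encoder.weight": [0.1, 0.2, 0.3, 0.4]}

model = ToyModel()
random_head = list(model.params["classifier.weight"])
missing = model.load_checkpoint(checkpoint)
```

After loading, the encoder weights come from the checkpoint while the head named in `missing` still holds the values set in the constructor, which is why an initialization step remains useful even when pretrained weights are loaded.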

python bert-language-model huggingface-transformers

All*_*n-J


Score: 0 · Answers: 1 · Views: 2817

Huggingface Transformers: truncation strategy in encode_plus

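encode_plus in Huggingface's transformers library allows truncation of the input sequence. Two parameters are relevant: truncation and max_length. I am passing a paired input sequence to encode_plus and need to truncate it in a simple "cut off" manner, i.e., if the whole sequence consisting of both inputs, text and text_pair, is longer than max_length, it should simply be truncated from the right accordingly.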

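It seems that none of the truncation strategies allows this. Instead, longest_first removes tokens from the longest sequence (which could be either text or text_pair, but not simply from the right end of the whole sequence; e.g., if text is longer than text_pair, it seems tokens would be removed from text first), only_first and only_second remove tokens from only the first or only the second sequence (hence, also not simply from the end), and do_not_truncate does not truncate at all. Or am I misunderstanding this, and longest_first could actually be what I am looking for?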
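The difference between the two behaviours can be sketched on plain token lists. This is an illustration only, not the tokenizer's implementation: truncate_from_right and truncate_longest_first are hypothetical helpers, the second is modelled on the question's description of longest_first (drop one token at a time from whichever sequence is currently longer), and special tokens such as [CLS] and [SEP] are ignored.

```python
def truncate_from_right(text_tokens, pair_tokens, max_length):
    """Keep the first max_length tokens of text + pair, cutting only at the far right."""
    keep_text = min(len(text_tokens), max_length)
    keep_pair = max(0, max_length - keep_text)
    return text_tokens[:keep_text], pair_tokens[:keep_pair]

def truncate_longest_first(text_tokens, pair_tokens, max_length):
    """Drop the last token of whichever sequence is currently longer until the pair fits."""
    text, pair = list(text_tokens), list(pair_tokens)
    while len(text) + len(pair) > max_length:
        if len(text) > len(pair):
            text.pop()
        else:
            pair.pop()
    return text, pair

text = ["a1", "a2", "a3", "a4", "a5", "a6"]
pair = ["b1", "b2", "b3"]

right = truncate_from_right(text, pair, 7)
longest = truncate_longest_first(text, pair, 7)
```

With a budget of 7 tokens, the right-cut variant keeps all six text tokens plus b1, while the longest-first variant shortens text to four tokens and keeps all of pair. A helper like the first one could be applied to the token lists before they are handed to the tokenizer.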

pytorch huggingface-transformers

ped*_*jjj


Score: 0 · Answers: 1 · Views: 3419

Problem with the tokenizer's batch_encode_plus method

I ran into a strange problem with the tokenizer's batch_encode_plus method. I recently switched from transformers version 3.3.0 to 4.5.1. (I am creating a databunch for NER.)

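I have two sentences that I need to encode, and I have a case where the sentences are already tokenized. Since the two sentences differ in length, I need to pad the shorter one with [PAD] to get a batch of uniform length.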

Below is the code I used with version 3.3.0 of transformers:

from transformers import AutoTokenizer

pretrained_model_name = 'distilbert-base-cased'
tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name, add_prefix_space=True)

sentences = ["He is an uninvited guest.", "The host of the party didn't sent him the invite."]

# here we have the complete sentences
encodings = tokenizer.batch_encode_plus(sentences, max_length=20, padding=True)
batch_token_ids, attention_masks = encodings["input_ids"], encodings["attention_mask"]
print(batch_token_ids[0])
print(tokenizer.convert_ids_to_tokens(batch_token_ids[0]))

# And the output
# [101, 1124, 1110, 1126, 8362, 1394, 5086, 1906, 3648, …

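The rest of the question is cut off, so the actual problem is not shown. The padding step it describes, bringing already tokenized sentences to a uniform length with [PAD] and a matching attention mask, can be sketched in plain Python. pad_batch is a hypothetical helper, and the word lists are simple illustrative splits rather than the WordPiece tokens the real tokenizer would produce.

```python
def pad_batch(token_batches, pad_token="[PAD]"):
    """Right-pad every token list to the longest length and build attention masks."""
    longest = max(len(tokens) for tokens in token_batches)
    padded, masks = [], []
    for tokens in token_batches:
        n_pad = longest - len(tokens)
        padded.append(list(tokens) + [pad_token] * n_pad)
        # 1 marks a real token, 0 marks padding.
        masks.append([1] * len(tokens) + [0] * n_pad)
    return padded, masks

# The two sentences from the question, already split into words.
batch = [
    ["He", "is", "an", "uninvited", "guest", "."],
    ["The", "host", "of", "the", "party", "didn't", "sent", "him", "the", "invite", "."],
]

padded, masks = pad_batch(batch)
```

Both rows of `padded` end up with eleven entries, and the mask of the shorter sentence ends in five zeros.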

python pytorch huggingface-transformers huggingface-tokenizers huggingface-datasets

Anu*_*rma


Score: 0 · Answers: 1 · Views: 6046