How do I add a Bi-LSTM layer on top of a BERT model?

Ala*_*ble 5 python neural-network deep-learning lstm pytorch

I am using PyTorch with a basic pretrained BERT model to classify sentences for hate speech. I want to implement a Bi-LSTM layer that takes all the outputs of the final transformer encoder layer of the BERT model as input to a new model (a class implementing nn.Module), but I am confused about the nn.LSTM parameters. I tokenized the data and set up the model with:

bert = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=int(data['class'].nunique()),
    output_attentions=False,
    output_hidden_states=False)
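The tokenization step itself might look something like this (a minimal sketch; the `sentence` column matches the dataset described below, and the padding/truncation settings are assumptions):

from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

# encode the sentence column into input ids and attention masks
# (max_length and truncation settings here are assumptions)
encodings = tokenizer(
    list(data['sentence']),
    padding=True,
    truncation=True,
    max_length=128,
    return_tensors="pt")

input_ids = encodings['input_ids']            # shape: (num_sentences, seq_len)
attention_mask = encodings['attention_mask']  # shape: (num_sentences, seq_len)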

My dataset has two columns: class (the label) and sentence. Can someone help me with this? Thanks in advance.

Edit: Additionally, after processing the input in the Bi-LSTM, the network should pass the final hidden state to a fully connected network that performs classification with a softmax activation function. How can I do that?

Ash*_*'Sa 12

You can do it as follows:

import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast

class CustomBERTModel(nn.Module):
    def __init__(self):
        super(CustomBERTModel, self).__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        ### New layers:
        self.lstm = nn.LSTM(768, 256, batch_first=True, bidirectional=True)
        self.linear = nn.Linear(256 * 2, <number_of_classes>)

    def forward(self, ids, mask):
        # sequence_output has shape (batch_size, sequence_length, 768):
        # one vector per token from the last transformer encoder layer
        sequence_output, pooled_output = self.bert(
            ids,
            attention_mask=mask,
            return_dict=False)

        # run the whole token sequence through the Bi-LSTM
        lstm_output, (h, c) = self.lstm(sequence_output)

        # concatenate the final forward hidden state (last time step, first
        # 256 units) with the final backward hidden state (first time step,
        # last 256 units) to get one 512-dim vector per sentence
        hidden = torch.cat((lstm_output[:, -1, :256], lstm_output[:, 0, 256:]), dim=-1)

        # classify from that combined hidden state; the softmax is applied
        # by the loss function (e.g. nn.CrossEntropyLoss) during training
        linear_output = self.linear(hidden)

        return linear_output

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = CustomBERTModel()
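To cover the softmax part of the edited question: the model above returns raw logits, and the usual PyTorch pattern is to let nn.CrossEntropyLoss apply the (log-)softmax internally during training, calling torch.softmax explicitly only when you want probabilities at inference time. A minimal training/inference sketch (the batch slicing, label encoding, and optimizer settings are assumptions):

import torch

# hypothetical mini-batch taken from the tokenized data above
ids = encodings['input_ids'][:8]
mask = encodings['attention_mask'][:8]
labels = torch.tensor(data['class'].values[:8])  # assumes integer-encoded labels

criterion = torch.nn.CrossEntropyLoss()  # applies log-softmax + NLL internally
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# one training step
model.train()
optimizer.zero_grad()
logits = model(ids, mask)         # shape: (batch_size, number_of_classes)
loss = criterion(logits, labels)
loss.backward()
optimizer.step()

# at inference time, convert logits to class probabilities explicitly
model.eval()
with torch.no_grad():
    probs = torch.softmax(model(ids, mask), dim=-1)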
