My code is as follows:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable

class Mymodel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, num_layers, batch_size):
        super(Mymodel, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.num_layers = num_layers
        self.batch_size = batch_size
        self.lstm = nn.LSTM(input_size, hidden_size)
        self.proj = nn.Linear(hidden_size, output_size)
        self.hidden = self.init_hidden()

    def init_hidden(self):
        return (Variable(torch.zeros(self.num_layers, self.batch_size, self.hidden_size)),
                Variable(torch.zeros(self.num_layers, self.batch_size, self.hidden_size)))

    def forward(self, x):
        # x: (time_step, batch_size, input_size)
        lstm_out, self.hidden = self.lstm(x, self.hidden)
        output = self.proj(lstm_out)   # applied to every timestep
        result = F.sigmoid(output)     # -> (time_step, batch_size, 1)
        return result
I want to use an LSTM to classify a sentence as good (1) or bad (0). With this code I get a result of shape time_step * batch_size * 1 instead of a 0 or 1. How do I edit the code to get a classification result?
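For reference, a minimal sketch (with made-up sizes: time_step=7, batch_size=4, input_size=10, hidden_size=16) reproduces the shape described above:
import torch
import torch.nn as nn

lstm = nn.LSTM(10, 16)        # input_size=10, hidden_size=16
proj = nn.Linear(16, 1)       # output_size=1
x = torch.randn(7, 4, 10)     # (time_step, batch_size, input_size)
out, _ = lstm(x)              # out: (7, 4, 16), one hidden state per timestep
result = torch.sigmoid(proj(out))
print(result.shape)           # torch.Size([7, 4, 1]) -- one value per timestep, not one per sentence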
Recall that an LSTM outputs a vector for every input in the series. You are using sentences, which are a series of words (probably converted to indices and then embedded as vectors). The following code from the PyTorch LSTM tutorial makes this clear (*** is my emphasis):
import torch
import torch.autograd as autograd
import torch.nn as nn

lstm = nn.LSTM(3, 3)  # Input dim is 3, output dim is 3
inputs = [autograd.Variable(torch.randn((1, 3)))
          for _ in range(5)]  # make a sequence of length 5

# initialize the hidden state.
hidden = (autograd.Variable(torch.randn(1, 1, 3)),
          autograd.Variable(torch.randn((1, 1, 3))))
for i in inputs:
    # Step through the sequence one element at a time.
    # after each step, hidden contains the hidden state.
    out, hidden = lstm(i.view(1, 1, -1), hidden)

# alternatively, we can do the entire sequence all at once.
# the first value returned by LSTM is all of the hidden states throughout
# the sequence. the second is just the most recent hidden state
# *** (compare the last slice of "out" with "hidden" below, they are the same)
# The reason for this is that:
# "out" will give you access to all hidden states in the sequence
# "hidden" will allow you to continue the sequence and backpropagate,
# by passing it as an argument to the lstm at a later time
# Add the extra 2nd dimension
inputs = torch.cat(inputs).view(len(inputs), 1, -1)
hidden = (autograd.Variable(torch.randn(1, 1, 3)), autograd.Variable(
          torch.randn((1, 1, 3))))  # clean out hidden state
out, hidden = lstm(inputs, hidden)
print(out)
print(hidden)
Once again: compare the last slice of "out" with "hidden" below; they are the same. Why? Well...
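You can check this directly (a minimal sketch reusing lstm, out, and hidden from the tutorial snippet above):
print(out[-1])    # output at the final timestep, shape (1, 3)
print(hidden[0])  # h_n, shape (1, 1, 3) -- same values as out[-1] for a single-layer, unidirectional LSTM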
If you are familiar with LSTMs, I'd recommend the PyTorch LSTM docs at this point. Under the outputs section, notice that h_t is output at every t.
Now, if you aren't used to LSTM-style equations, take a look at Chris Olah's LSTM blog post. Scroll down to the diagram of the unrolled network:
As you feed your sentence in word-by-word (x_i by x_i+1), you get an output from each timestep. You want to interpret the entire sentence to classify it, so you must wait until the LSTM has seen all the words. That is, you need to take h_t where t is the number of words in your sentence.
Here is a code reference. I won't copy the entire thing, just the relevant parts. The magic happens at self.hidden2label(lstm_out[-1]).
class LSTMClassifier(nn.Module):
    def __init__(self, embedding_dim, hidden_dim, vocab_size, label_size, batch_size):
        ...
        self.word_embeddings = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim)
        self.hidden2label = nn.Linear(hidden_dim, label_size)
        self.hidden = self.init_hidden()

    def init_hidden(self):
        return (autograd.Variable(torch.zeros(1, self.batch_size, self.hidden_dim)),
                autograd.Variable(torch.zeros(1, self.batch_size, self.hidden_dim)))

    def forward(self, sentence):
        embeds = self.word_embeddings(sentence)
        x = embeds.view(len(sentence), self.batch_size, -1)
        lstm_out, self.hidden = self.lstm(x, self.hidden)
        y = self.hidden2label(lstm_out[-1])
        log_probs = F.log_softmax(y)
        return log_probs
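A rough usage sketch of the classifier above (the sizes, the word indices, and the assumption that the elided __init__ stores batch_size and hidden_dim are all illustrative, not from the original):
# Hypothetical usage; all values are made up for illustration.
model = LSTMClassifier(embedding_dim=100, hidden_dim=50, vocab_size=10000,
                       label_size=2, batch_size=1)
sentence = autograd.Variable(torch.LongTensor([12, 7, 431, 2, 88]))  # word indices
model.hidden = model.init_hidden()   # reset the LSTM state for each new sentence
log_probs = model(sentence)          # shape (1, 2): one score per class
_, pred = log_probs.max(1)           # 0 = bad, 1 = good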
As your last layer, you need a linear layer with as many outputs as you have classes, e.g. 10 if you were doing digit classification as in MNIST. For your case, since you are doing a yes/no (1/0) classification, you have two labels/classes, so the linear layer has two outputs. I suggest adding a linear layer
nn.Linear(feature_size_from_previous_layer, 2)
and then training the model with a cross-entropy loss.
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
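A minimal training-loop sketch under that setup (num_epochs, train_data, and the assumption that net's last layer is the nn.Linear(..., 2) producing raw scores are placeholders, not from the original answer):
# Illustrative loop only; train_data is assumed to yield (sentence_tensor, label_tensor)
# pairs where label is 0 (bad) or 1 (good).
for epoch in range(num_epochs):
    for sentence, label in train_data:
        net.zero_grad()
        net.hidden = net.init_hidden()   # reset the LSTM state before each sentence
        scores = net(sentence)           # shape (batch_size, 2), raw scores from the linear layer
        loss = criterion(scores, label)  # CrossEntropyLoss applies log_softmax internally
        loss.backward()
        optimizer.step()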