Is there a clean and extensible LSTM implementation in PyTorch?

Gui*_*ier 10 implementation open-source coding-style lstm pytorch

I would like to create my own LSTM class, but I don't want to rewrite the classic LSTM function from scratch again.

Digging through PyTorch's code, I only found a messy implementation that involves at least 3-4 classes with inheritance: https://github.com/pytorch/pytorch/blob/98c24fae6b6400a7d1e13610b20aa05f86f77070/torch/nn/modules/rnn.py#L323 https://github.com/pytorch/pytorch/blob/98c24fae6b6400a7d1e13610b20aa05f86f77070/torch/nn/modules/rnn.py#L12 https://github.com/pytorch/pytorch/blob/98c24fae6b6400a7d1e13610b20aa05f86f77070/torch/nn/_functions/rnn.py#L297

Does a clean PyTorch implementation of an LSTM exist somewhere? Any links would help. For example, I know that clean implementations of an LSTM exist in TensorFlow, but I would need to derive a PyTorch one.

A clear example of what I'm looking for is an implementation like the following, but in PyTorch: https://github.com/hardmaru/supercell/blob/063b01e75e6e8af5aeb0aac5cc583948f5887dd1/supercell.py#L143

Ric*_*ard 12

The best implementation I found is here:
https://github.com/pytorch/benchmark/blob/master/benchmarks/lstm_variants/lstm.py

It even implements four different variants of recurrent dropout, which is very useful!
If you take the dropout parts away, you get:

import math
import torch as th
import torch.nn as nn

class LSTM(nn.Module):

    def __init__(self, input_size, hidden_size, bias=True):
        super(LSTM, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.bias = bias
        self.i2h = nn.Linear(input_size, 4 * hidden_size, bias=bias)
        self.h2h = nn.Linear(hidden_size, 4 * hidden_size, bias=bias)
        self.reset_parameters()

    def reset_parameters(self):
        std = 1.0 / math.sqrt(self.hidden_size)
        for w in self.parameters():
            w.data.uniform_(-std, std)

    def forward(self, x, hidden):
        h, c = hidden
        # flatten away the leading dimension: x, h and c come in as (1, batch, features)
        h = h.view(h.size(1), -1)
        c = c.view(c.size(1), -1)
        x = x.view(x.size(1), -1)

        # Linear mappings
        preact = self.i2h(x) + self.h2h(h)

        # activations: sigmoid for the three gates, tanh for the candidate cell state
        gates = preact[:, :3 * self.hidden_size].sigmoid()
        g_t = preact[:, 3 * self.hidden_size:].tanh()
        i_t = gates[:, :self.hidden_size]                      # input gate
        f_t = gates[:, self.hidden_size:2 * self.hidden_size]  # forget gate
        o_t = gates[:, -self.hidden_size:]                     # output gate

        # new cell state: keep part of the old state (forget gate) and add the gated input
        c_t = th.mul(c, f_t) + th.mul(i_t, g_t)

        # new hidden state: gated tanh of the cell state
        h_t = th.mul(o_t, c_t.tanh())

        h_t = h_t.view(1, h_t.size(0), -1)
        c_t = c_t.view(1, c_t.size(0), -1)
        return h_t, (h_t, c_t)
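
For reference, here is how one could drive a single step of the cell above (a minimal, untested sketch; the (1, batch, features) layout follows my reading of the view() calls in forward, and the sizes below are made-up example values):

batch, input_size, hidden_size = 4, 10, 20
cell = LSTM(input_size, hidden_size)

x = th.randn(1, batch, input_size)    # one time step
h0 = th.zeros(1, batch, hidden_size)  # initial hidden state
c0 = th.zeros(1, batch, hidden_size)  # initial cell state

out, (h1, c1) = cell(x, (h0, c0))
print(out.shape, c1.shape)            # both th.Size([1, 4, 20])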

PS: The repository contains many more variants of LSTMs and other RNNs:
https://github.com/pytorch/benchmark/tree/master/benchmarks.
Take a look, maybe the extension you have in mind already exists there!

Edit:
As mentioned in the comments, you can wrap the LSTM cell above in a module that processes whole sequences:

import math
import torch as th
import torch.nn as nn


class LSTMCell(nn.Module):

    def __init__(self, input_size, hidden_size, bias=True):
        # As before

    def reset_parameters(self):
        # As before

    def forward(self, x, hidden):

        if hidden is None:
            hidden = self._init_hidden(x)

        # Rest as before

    def _init_hidden(self, input_):
        # the initial states need hidden_size features, not input_size, so they
        # are built explicitly rather than with zeros_like on the input
        h = input_.new_zeros(1, input_.size(1), self.hidden_size)
        c = input_.new_zeros(1, input_.size(1), self.hidden_size)
        return h, c


class LSTM(nn.Module):

    def __init__(self, input_size, hidden_size, bias=True):
        super().__init__()
        self.lstm_cell = LSTMCell(input_size, hidden_size, bias)

    def forward(self, input_, hidden=None):
        # input_ is expected as (1, time, batch, input_size), so that each
        # unbound time step matches the (1, batch, input_size) layout of the cell

        outputs = []
        for x in th.unbind(input_, dim=1):
            # the cell returns (h_t, (h_t, c_t)); keep the output, carry the state
            out, hidden = self.lstm_cell(x, hidden)
            outputs.append(out.clone())

        return th.stack(outputs, dim=1)

I haven't tested the code, since I'm working with a convLSTM implementation. Please let me know if something is wrong.
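
Assuming the "# As before" parts are filled in from the cell above, a quick smoke test of the wrapper could look like this (again untested; the (1, time, batch, input_size) layout is simply my reading of the cell's view() calls, and all sizes are made up):

seq = th.randn(1, 5, 4, 10)          # (1, time=5, batch=4, input_size=10)
lstm = LSTM(input_size=10, hidden_size=20)
out = lstm(seq)                      # hidden defaults to None and is initialised inside
print(out.shape)                     # (1, 5, 4, 20): one hidden state per time step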

  • The code I gave above is usually called an LSTM cell. To handle sequential input, simply wrap it in a module that sets up the initial hidden state and then iterates over the time dimension of the input, calling the LSTM cell at each time step (similar to what is done here: https://discuss.pytorch.org/t/implementation-of-multiplicative-lstm/2328/9) (2 upvotes)
  • I have now edited my answer to give an example of how to wrap the LSTM cell in a full LSTM module. That way you can also easily extend the code to a multi-layer implementation (iterating forward through several LSTM cells inside the LSTM); a rough sketch of that idea follows below. (2 upvotes)
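
For illustration, the multi-layer idea from the last comment could be sketched roughly like this (my own untested extension on top of the LSTMCell above; StackedLSTM is a hypothetical name, not something taken from the benchmark repository):

class StackedLSTM(nn.Module):

    def __init__(self, input_size, hidden_size, num_layers=2, bias=True):
        super().__init__()
        # the first layer maps input_size -> hidden_size, the rest hidden_size -> hidden_size
        in_sizes = [input_size] + [hidden_size] * (num_layers - 1)
        self.cells = nn.ModuleList(
            [LSTMCell(in_size, hidden_size, bias) for in_size in in_sizes]
        )

    def forward(self, input_, hiddens=None):
        # input_: (1, time, batch, input_size), as in the single-layer wrapper above
        if hiddens is None:
            hiddens = [None] * len(self.cells)

        seq = input_
        for i, cell in enumerate(self.cells):
            outputs = []
            for x in th.unbind(seq, dim=1):
                out, hiddens[i] = cell(x, hiddens[i])
                outputs.append(out)
            seq = th.stack(outputs, dim=1)  # this layer's outputs feed the next layer
        return seq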