Abh*_*tia 29 neural-network lstm pytorch rnn
import torch, ipdb
import torch.autograd as autograd
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.autograd import Variable

rnn = nn.LSTM(input_size=10, hidden_size=20, num_layers=2)
input = Variable(torch.randn(5, 3, 10))   # (seq_len, batch, input_size)
h0 = Variable(torch.randn(2, 3, 20))      # (num_layers, batch, hidden_size)
c0 = Variable(torch.randn(2, 3, 20))      # (num_layers, batch, hidden_size)
output, hn = rnn(input, (h0, c0))         # hn is the tuple (h_n, c_n)
This is the LSTM example from the documentation. I don't understand the following:

Edit:
import torch, ipdb
import torch.autograd as autograd
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.autograd import Variable

num_layers = 3
num_hyperparams = 4
batch = 1
hidden_size = 20

rnn = nn.LSTM(input_size=num_hyperparams, hidden_size=hidden_size, num_layers=num_layers)

input = Variable(torch.randn(1, batch, num_hyperparams))    # (seq_len, batch, input_size)
h0 = Variable(torch.randn(num_layers, batch, hidden_size))  # (num_layers, batch, hidden_size)
c0 = Variable(torch.randn(num_layers, batch, hidden_size))  # (num_layers, batch, hidden_size)

output, hn = rnn(input, (h0, c0))
affine1 = nn.Linear(hidden_size, num_hyperparams)

ipdb.set_trace()
print(output.size())
print(h0.size())
RuntimeError: matrices expected, got 3D, 2D tensors
cdo*_*256 30
The output of the LSTM is the output of all the hidden nodes on the final layer.

hidden_size - the number of LSTM blocks per layer.
input_size - the number of input features per time step.
num_layers - the number of hidden layers.

In total there are hidden_size * num_layers LSTM blocks.

The input dimensions are (seq_len, batch, input_size).

seq_len - the number of time steps in each input stream.
batch - the size of each batch of input sequences.

The hidden and cell dimensions are (num_layers, batch, hidden_size).

output (seq_len, batch, hidden_size * num_directions): tensor containing the output features (h_t) from the last layer of the RNN, for each t.

So there will be hidden_size * num_directions outputs. You did not initialize the RNN to be bidirectional, so num_directions is 1, and therefore output_size = hidden_size. The shapes are summarized in the sketch below.
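A minimal sketch making these shapes concrete; the sizes reuse the question's edited example, and it assumes PyTorch >= 0.4, where Variable is no longer needed:

import torch
import torch.nn as nn

seq_len, batch, input_size = 1, 1, 4
hidden_size, num_layers = 20, 3

rnn = nn.LSTM(input_size=input_size, hidden_size=hidden_size, num_layers=num_layers)

x = torch.randn(seq_len, batch, input_size)        # (seq_len, batch, input_size)
h0 = torch.randn(num_layers, batch, hidden_size)   # (num_layers, batch, hidden_size)
c0 = torch.randn(num_layers, batch, hidden_size)   # (num_layers, batch, hidden_size)

output, (hn, cn) = rnn(x, (h0, c0))

print(output.size())  # torch.Size([1, 1, 20]) -> (seq_len, batch, hidden_size * num_directions)
print(hn.size())      # torch.Size([3, 1, 20]) -> (num_layers, batch, hidden_size)
print(cn.size())      # torch.Size([3, 1, 20]) -> (num_layers, batch, hidden_size)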
Edit: You can change the number of outputs by using a linear layer:
out_rnn, hn = rnn(input, (h0, c0))
lin = nn.Linear(hidden_size, output_size)
# nn.Linear expects a 2D input here, so flatten (seq_len, batch, hidden_size)
# to (seq_len * batch, hidden_size), apply the layer, then reshape back
flat = out_rnn.view(seq_len * batch, hidden_size)
output = lin(flat).view(seq_len, batch, output_size)
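As a side note, in more recent PyTorch versions nn.Linear accepts inputs of shape (*, in_features) and is applied to the last dimension, so the flatten/reshape above is only needed on the older versions this answer was written for:

lin = nn.Linear(hidden_size, output_size)
output = lin(out_rnn)  # (seq_len, batch, output_size) on PyTorch >= 0.4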
Note: for this answer I am assuming we are only talking about non-bidirectional LSTMs.

Source: PyTorch documentation.
cdo256's answer is almost correct. He is only mistaken about what hidden_size means. He explains it as:

hidden_size - the number of LSTM blocks per layer.

But really, here is a better explanation:

Each sigmoid, tanh or hidden-state layer in the cell is actually a set of nodes, whose number is equal to the hidden-layer size. Therefore each of the "nodes" in the LSTM cell is actually a cluster of normal neural-network nodes, as in each layer of a densely connected neural network. Hence, if you set hidden_size = 10, each of your LSTM blocks, or cells, will have neural networks with 10 nodes in them. The total number of LSTM blocks in your LSTM model will be equivalent to your sequence length.
This can be seen by analyzing the difference between the examples for nn.LSTM and nn.LSTMCell (see the sketch after the links below):
https://pytorch.org/docs/stable/nn.html#torch.nn.LSTM
and
https://pytorch.org/docs/stable/nn.html#torch.nn.LSTMCell
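A rough sketch of that difference, with sizes made up purely for illustration: nn.LSTM consumes the whole sequence at once, while nn.LSTMCell is a single time step that you unroll yourself, and in both cases hidden_size sets the width of the per-step state.

import torch
import torch.nn as nn

seq_len, batch, input_size, hidden_size = 5, 3, 10, 20

# nn.LSTM processes the whole sequence in one call
lstm = nn.LSTM(input_size, hidden_size)
x = torch.randn(seq_len, batch, input_size)
out_full, _ = lstm(x)                      # (seq_len, batch, hidden_size)

# nn.LSTMCell is one time step; you unroll it over seq_len steps yourself
cell = nn.LSTMCell(input_size, hidden_size)
hx = torch.zeros(batch, hidden_size)       # hidden state, width = hidden_size
cx = torch.zeros(batch, hidden_size)       # cell state, width = hidden_size
outputs = []
for t in range(seq_len):
    hx, cx = cell(x[t], (hx, cx))          # each step works on hidden_size-wide vectors
    outputs.append(hx)
out_manual = torch.stack(outputs)          # (seq_len, batch, hidden_size)

print(out_full.size(), out_manual.size())  # both torch.Size([5, 3, 20])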