How to use a multi-layer bidirectional LSTM in TensorFlow?

Gi *_*hin 8

Tags: bidirectional, multi-layer, lstm, tensorflow, recurrent-neural-network

I would like to know how to use a multi-layer bidirectional LSTM in TensorFlow.

I have already implemented a bidirectional LSTM, but I would like to compare that model against a version with additional layers.

How can I add the extra layers in this part of the code?

import tensorflow as tf
from tensorflow.contrib import rnn

# x: [batch_size, n_steps, n_input] -> list of n_steps tensors of shape [batch_size, n_input]
x = tf.unstack(tf.transpose(x, perm=[1, 0, 2]))

# Define lstm cells with tensorflow
# Forward direction cell
lstm_fw_cell = rnn.BasicLSTMCell(n_hidden, forget_bias=1.0)
# Backward direction cell
lstm_bw_cell = rnn.BasicLSTMCell(n_hidden, forget_bias=1.0)

# Get lstm cell output
try:
    outputs, _, _ = rnn.static_bidirectional_rnn(lstm_fw_cell, lstm_bw_cell, x,
                                          dtype=tf.float32)
except Exception: # Old TensorFlow version only returns outputs not states
    outputs = rnn.static_bidirectional_rnn(lstm_fw_cell, lstm_bw_cell, x,
                                    dtype=tf.float32)

# Linear activation, using rnn inner loop last output
outputs = tf.stack(outputs, axis=1)
outputs = tf.reshape(outputs, (batch_size*n_steps, n_hidden*2))
outputs = tf.matmul(outputs, weights['out']) + biases['out']
outputs = tf.reshape(outputs, (batch_size, n_steps, n_classes))

小智 5

You can use two different approaches to build a multi-layer BiLSTM model:

1) Use the output of the previous BiLSTM layer as the input to the next one. At the start, you should create arrays of forward and backward cells, each of length num_layers, and then loop over them.
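The construction of those cell lists is not shown here; a minimal sketch, assuming TF 1.x with tf.contrib.rnn and rnn_size as the hidden size, could look like:

cell_forw = [tf.contrib.rnn.LSTMCell(rnn_size) for _ in range(num_layers)]  # forward cells, one per layer
cell_back = [tf.contrib.rnn.LSTMCell(rnn_size) for _ in range(num_layers)]  # backward cells, one per layer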

# `output` starts as the input tensor, e.g. output = x with shape
# [batch_size, n_steps, n_input] (the answer leaves this initialization implicit).
for n in range(num_layers):
    cell_fw = cell_forw[n]
    cell_bw = cell_back[n]

    state_fw = cell_fw.zero_state(batch_size, tf.float32)
    state_bw = cell_bw.zero_state(batch_size, tf.float32)

    # A distinct scope per layer keeps the layer weights separate and avoids
    # the "variable already exists" error.
    (output_fw, output_bw), last_state = tf.nn.bidirectional_dynamic_rnn(
        cell_fw, cell_bw, output,
        initial_state_fw=state_fw,
        initial_state_bw=state_bw,
        scope='BLSTM_' + str(n),
        dtype=tf.float32)

    # Concatenate forward and backward outputs along the feature axis.
    output = tf.concat([output_fw, output_bw], axis=2)

2) It is also worth mentioning another way of stacking BiLSTMs, sketched below.
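The answer does not spell this alternative out, but it most likely refers to TF 1.x's built-in helper tf.contrib.rnn.stack_bidirectional_dynamic_rnn, which performs the same layer-by-layer stacking internally. A minimal sketch, reusing the num_layers and rnn_size names from above:

cells_fw = [tf.contrib.rnn.LSTMCell(rnn_size) for _ in range(num_layers)]
cells_bw = [tf.contrib.rnn.LSTMCell(rnn_size) for _ in range(num_layers)]

# x: [batch_size, n_steps, n_input]; outputs: [batch_size, n_steps, 2*rnn_size]
outputs, state_fw, state_bw = tf.contrib.rnn.stack_bidirectional_dynamic_rnn(
    cells_fw, cells_bw, x, dtype=tf.float32)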

  • I tried this and got the following error: ValueError: Variable bidirectional_rnn/fw/lstm_cell/kernel already exists, disallowed. Did you mean to set reuse=True in VarScope? Could you provide a working example? (2 upvotes)

mni*_*nis 5

This is basically the same as the first answer, but with a slightly different use of scope names and an added dropout wrapper. It also resolves the variable-scope error reported under the first answer.

def bidirectional_lstm(input_data, num_layers, rnn_size, keep_prob):

    output = input_data
    for layer in range(num_layers):
        with tf.variable_scope('encoder_{}'.format(layer),reuse=tf.AUTO_REUSE):

            # By giving a different variable scope to each layer, I've ensured that
            # the weights are not shared among the layers. If you want to share the
            # weights, you can do that by giving variable_scope as "encoder" but do
            # make sure first that reuse is set to tf.AUTO_REUSE

            cell_fw = tf.contrib.rnn.LSTMCell(rnn_size,
                                              initializer=tf.truncated_normal_initializer(-0.1, 0.1, seed=2))
            cell_fw = tf.contrib.rnn.DropoutWrapper(cell_fw, input_keep_prob=keep_prob)

            cell_bw = tf.contrib.rnn.LSTMCell(rnn_size,
                                              initializer=tf.truncated_normal_initializer(-0.1, 0.1, seed=2))
            cell_bw = tf.contrib.rnn.DropoutWrapper(cell_bw, input_keep_prob=keep_prob)

            outputs, states = tf.nn.bidirectional_dynamic_rnn(cell_fw, 
                                                              cell_bw, 
                                                              output,
                                                              dtype=tf.float32)

            # Concat the forward and backward outputs
            output = tf.concat(outputs, 2)

    return output
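A hypothetical usage sketch (the placeholder shapes and layer count below are illustrative, not from the answer):

import tensorflow as tf

# Illustrative input: variable batch size, 100 time steps, 64 features per step.
input_data = tf.placeholder(tf.float32, [None, 100, 64])
keep_prob = tf.placeholder(tf.float32)

# encoded: [batch_size, 100, 2 * rnn_size] after three stacked BiLSTM layers.
encoded = bidirectional_lstm(input_data, num_layers=3, rnn_size=128, keep_prob=keep_prob)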