How to implement TensorFlow batch normalization in an LSTM

Question

Ben*_*iBB 19 python neural-network lstm tensorflow rnn

My current LSTM network looks like this:

import tensorflow as tf  # TensorFlow 1.x API (tf.contrib and tf.layers)

rnn_cell = tf.contrib.rnn.BasicRNNCell(num_units=CELL_SIZE)
init_s = rnn_cell.zero_state(batch_size=1, dtype=tf.float32)  # very first hidden state
outputs, final_s = tf.nn.dynamic_rnn(
    rnn_cell,              # cell you have chosen
    tf_x,                  # input
    initial_state=init_s,  # the initial hidden state
    time_major=False,      # False: (batch, time step, input); True: (time step, batch, input)
)

# reshape 3D output to 2D for fully connected layer
outs2D = tf.reshape(outputs, [-1, CELL_SIZE])
net_outs2D = tf.layers.dense(outs2D, INPUT_SIZE)

# reshape back to 3D
outs = tf.reshape(net_outs2D, [-1, TIME_STEP, INPUT_SIZE])

Normally I apply tf.layers.batch_normalization for batch normalization, but I'm not sure whether this works with an LSTM network.

b1 = tf.layers.batch_normalization(outputs, momentum=0.4, training=True)
d1 = tf.layers.dropout(b1, rate=0.4, training=True)

# reshape 3D output to 2D for fully connected layer
outs2D = tf.reshape(d1, [-1, CELL_SIZE])                       
net_outs2D = tf.layers.dense(outs2D, INPUT_SIZE)

# reshape back to 3D
outs = tf.reshape(net_outs2D, [-1, TIME_STEP, INPUT_SIZE])

Answer 1

ngo*_*mao 3

If you want to use batch normalization with an RNN (LSTM or GRU), you can look at this implementation, or read the full description in the blog post.

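However, for sequence data, layer normalization has advantages over batch normalization. Specifically, "the effect of batch normalization is dependent on the mini-batch size and it is not obvious how to apply it to recurrent neural networks" (from the paper "Layer Normalization" by Ba et al.).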
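
To make the mini-batch dependence concrete, here is a minimal, dependency-free sketch (plain Python lists rather than TensorFlow tensors). The helper names `normalize`, `batch_norm`, and `layer_norm` are illustrative assumptions, not library functions, and the learned scale and shift parameters are left out for brevity.

```python
import math


def normalize(values, eps=1e-5):
    """Shift and scale a list of numbers to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def batch_norm(batch):
    """Normalize each feature across the batch (statistics depend on the batch)."""
    columns = [normalize(list(col)) for col in zip(*batch)]
    return [list(row) for row in zip(*columns)]


def layer_norm(batch):
    """Normalize each sample across its own features (no batch statistics)."""
    return [normalize(row) for row in batch]


sample = [1.0, 2.0, 4.0]
batch_a = [sample, [0.0, 0.0, 0.0]]
batch_b = [sample, [10.0, -3.0, 7.0], [5.0, 5.0, 5.0]]

# Batch norm: the same sample is mapped to different values in each batch.
print(batch_norm(batch_a)[0])
print(batch_norm(batch_b)[0])

# Layer norm: the result for `sample` is identical in both batches.
print(layer_norm(batch_a)[0])
print(layer_norm(batch_b)[0])
```

Because layer normalization uses only the features of a single sample, it behaves the same for any batch size, including the batch_size=1 used in the question.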

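Layer normalization normalizes the summed inputs within each layer. You can also look at the implementation of layer normalization for a GRU cell.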
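
As a rough illustration of "normalizing the summed inputs within each layer", the sketch below applies layer normalization inside one step of a plain tanh RNN. It is not the linked GRU implementation: the function names, the toy weights, and the sizes (2 inputs, 3 hidden units) are all hypothetical, and a real LSTM or GRU would normalize each gate's pre-activation in a similar way.

```python
import math


def layer_norm(summed_inputs, gain, bias, eps=1e-5):
    """Normalize one layer's summed inputs over its hidden units, then rescale."""
    n = len(summed_inputs)
    mean = sum(summed_inputs) / n
    var = sum((a - mean) ** 2 for a in summed_inputs) / n
    std = math.sqrt(var + eps)
    return [g * (a - mean) / std + b for a, g, b in zip(summed_inputs, gain, bias)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def ln_rnn_step(x_t, h_prev, w_x, w_h, gain, bias):
    """One step of a plain tanh RNN with layer normalization before the activation."""
    # Summed inputs to the hidden layer at this time step.
    a_t = [ax + ah for ax, ah in zip(matvec(w_x, x_t), matvec(w_h, h_prev))]
    return [math.tanh(v) for v in layer_norm(a_t, gain, bias)]


# Hypothetical toy parameters: 2 input features, 3 hidden units.
w_x = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
w_h = [[0.2, 0.0, -0.1], [0.0, 0.3, 0.1], [0.4, -0.2, 0.0]]
gain = [1.0, 1.0, 1.0]  # learned scale in a real model
bias = [0.0, 0.0, 0.0]  # learned shift in a real model

h = [0.0, 0.0, 0.0]
for x_t in [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]:  # one sequence, three time steps
    h = ln_rnn_step(x_t, h, w_x, w_h, gain, bias)
print(h)
```

The statistics are recomputed at every time step from that step's own summed inputs, so the same code works for sequences of any length. In TensorFlow 1.x, tf.contrib.rnn.LayerNormBasicLSTMCell offers an LSTM cell with layer normalization built in, and it can be passed to tf.nn.dynamic_rnn in place of the cell in the question. Note that tf.contrib was removed in TensorFlow 2.x.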

Asked:	7 years, 10 months ago
Viewed:	1513 times
Last activity:	6 years ago