RNN model runs out of memory in TensorFlow

Maa*_*ten 2 tensorflow

I implemented a sequence-to-sequence model using the rnn.rnn helper in TensorFlow.

import tensorflow as tf
from tensorflow.python.ops import rnn

with tf.variable_scope("rnn") as scope, tf.device("/gpu:0"):
    # Two stacked LSTM layers with 4096 units each
    cell = tf.nn.rnn_cell.BasicLSTMCell(4096)
    lstm = tf.nn.rnn_cell.MultiRNNCell([cell] * 2)

    # Encode the input sequence, then reuse the weights to decode
    _, cell = rnn.rnn(lstm, input_vectors, dtype=tf.float32)
    tf.get_variable_scope().reuse_variables()
    lstm_outputs, _ = rnn.rnn(lstm, output_vectors, initial_state=cell)

The model runs out of memory on a Titan X with 16 GB of memory while allocating the gradients for the LSTM cells:

W tensorflow/core/kernels/matmul_op.cc:158] Resource exhausted: OOM when allocating tensor with shape[8192,16384]
W tensorflow/core/common_runtime/executor.cc:1102] 0x2b42f00 Compute status: Resource exhausted: OOM when allocating tensor with shape[8192,16384]
     [[Node: gradients/rnn/RNN/MultiRNNCell_1/Cell0/BasicLSTMCell/Linear/MatMul_grad/MatMul_1 = MatMul[T=DT_FLOAT, transpose_a=true, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](rnn/RNN/MultiRNNCell_1/Cell0/BasicLSTMCell/Linear/concat, gradients/rnn/RNN/MultiRNNCell_1/Cell0/BasicLSTMCell/add_grad/tuple/control_dependency)]]

If I reduce the length of the input and output sequences to 4 or less, the model runs without a problem.

This suggests to me that TF is trying to allocate the gradients for all time steps at the same time. Is there a way of avoiding this?

Maa*_*ten 5

The function tf.gradients, as well as the minimize method of the optimizers, lets you set a parameter called aggregation_method. The default value is ADD_N. This method constructs the graph in such a way that all gradients need to be computed at the same time.

There are two other, undocumented, methods called tf.AggregationMethod.EXPERIMENTAL_TREE and tf.AggregationMethod.EXPERIMENTAL_ACCUMULATE_N, which do not have this requirement.
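A minimal sketch of passing the parameter, written against the TF 1.x-era graph API (via tf.compat.v1 here; the toy variable and loss are illustrative, not the question's model):

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

x = tf.Variable([1.0, 2.0])
loss = tf.reduce_sum(tf.square(x))  # sum(x^2), so d(loss)/dx = 2*x

# Explicit gradient computation with a non-default aggregation method
grads = tf.gradients(
    loss, [x],
    aggregation_method=tf.AggregationMethod.EXPERIMENTAL_TREE)

# The optimizer's minimize method accepts the same parameter
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(
    loss,
    aggregation_method=tf.AggregationMethod.EXPERIMENTAL_ACCUMULATE_N)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    g = sess.run(grads)[0]  # gradient of sum(x^2) at [1, 2] is [2, 4]
```

On a trivial graph like this the aggregation method makes no visible difference; it matters when many per-timestep gradient tensors would otherwise have to be held in memory simultaneously before being summed.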