Backpropagating gradients through nested tf.map_fn

Question


Gab*_*ele 5 gradient nested backpropagation map-function tensorflow

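I want to map a TensorFlow function over the vector formed by the depth channels of each pixel in a tensor of shape [batch_size, H, W, n_channels].

In other words, for every image of size H x W in the batch:

I extract some feature maps F_k (n_channels of them), each with the same size H x W, so that together the feature maps form a tensor of shape [H, W, n_channels];

then I want to apply a custom function to the vector v_ij associated with the i-th row and j-th column of every feature map F_k, spanning the whole depth dimension (i.e. v has shape [1 x 1 x n_channels]). Ideally, all of this would happen in parallel.

A picture illustrating the process is below. The only difference from the picture is that the input and output "receptive fields" both have size 1x1 (the function is applied to each pixel independently).

This is similar to applying a 1x1 convolution to the tensor; however, I need to apply a more general function across the depth channels, rather than a simple sum.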
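
To make the 1x1 convolution analogy concrete, here is a minimal NumPy sketch (not TensorFlow, and not the code from this question). It assumes a hypothetical helper `apply_per_pixel` and an arbitrary `kernel`, and shows that a 1x1 convolution is the special case where the per-pixel function is a matrix product over the channel vector:

```python
import numpy as np

def conv1x1(x, kernel):
    """1x1 convolution: the same linear map applied to every pixel's channel vector.

    x has shape [batch, H, W, C_in]; kernel has shape [C_in, C_out].
    """
    return np.einsum("bhwc,cd->bhwd", x, kernel)

def apply_per_pixel(x, fn):
    """Apply fn (hypothetical: maps a [C] vector to a vector) to every pixel."""
    b, h, w, c = x.shape
    flat = x.reshape(-1, c)                  # one row per pixel
    out = np.stack([fn(v) for v in flat])    # explicit loop, for clarity only
    return out.reshape(b, h, w, -1)

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4, 5, 3))
kernel = rng.normal(size=(3, 3))

# A 1x1 convolution is the special case fn(v) = v @ kernel.
linear = apply_per_pixel(x, lambda v: v @ kernel)
assert np.allclose(linear, conv1x1(x, kernel))

# A more general per-pixel function, e.g. sorting the channel values.
sorted_channels = apply_per_pixel(x, np.sort)
assert sorted_channels.shape == x.shape
```

The Python loop is only there to spell out the semantics; it says nothing about how the mapping should be implemented efficiently in a TensorFlow graph.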

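I thought tf.map_fn() might be an option, and I tried the following solution, in which I call tf.map_fn() recursively to reach the features associated with each pixel. However, this seems suboptimal and, most importantly, it raises an error when trying to backpropagate the gradients.

Do you know why this happens and how I should structure my code to avoid the error?

This is my current implementation of the function: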

import tensorflow as tf
from tensorflow import layers

def apply_function_on_pixel_features(incoming):
    # at first the input is [None, H, W, n_channels]
    if len(incoming.get_shape()) > 1:
        return tf.map_fn(lambda x: apply_function_on_pixel_features(x), incoming)
    else:
        # here the input is [n_channels]
        # apply some function that applies a transformation and returns a vector of the same size
        output = my_custom_fun(incoming)  # my_custom_fun() doesn't change the shape
        return output
And the main body of my code:

H = 128
W = 132
n_channels = 8

x1 = tf.placeholder(tf.float32, [None, H, W, 1])
x2 = layers.conv2d(x1, filters=n_channels, kernel_size=3, padding='same')

# now apply a function to the features vector associated to each pixel
x3 = apply_function_on_pixel_features(x2)
x4 = tf.nn.softmax(x3)

loss = cross_entropy(x4, labels)
optimizer = tf.train.AdamOptimizer(lr)
train_op = optimizer.minimize(loss)  # <--- ERROR HERE!
In particular, the error is the following:

  File "/home/venvs/tensorflowGPU/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2481, in AddOp
    self._AddOpInternal(op)
  File "/home/venvs/tensorflowGPU/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2509, in _AddOpInternal
    self._MaybeAddControlDependency(op)
  File "/home/venvs/tensorflowGPU/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2547, in _MaybeAddControlDependency
    op._add_control_input(self.GetControlPivot().op)
AttributeError: 'NoneType' object has no attribute 'op'
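The full error stack and code can be found here. Thank you for your help,

G.

Update:

Following @thushv89's suggestion, I have added a possible solution to the problem. I still don't know why my previous code didn't work. Any insight on this would still be greatly appreciated.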

Answer 1

thu*_*v89 1

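@gabriele Regarding having to depend on batch_size, have you tried doing it the following way? This function does not depend on batch_size. You can replace the map_fn with anything you like.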

def apply_function_on_pixel_features(incoming):

    # get the static input shape: [None, H, W, C]
    _, H, W, C = incoming.get_shape().as_list()
    incoming_flat = tf.reshape(incoming, shape=[-1, C])

    # apply function on every vector of shape [C]
    out_matrix = tf.map_fn(lambda x: x+1, incoming_flat)  # dimension remains unchanged

    # go back to the input shape [None, H, W, C]
    out_matrix = tf.reshape(out_matrix, shape=[-1, H, W, C])

    return out_matrix
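
As a sanity check on the reshape idea, the following NumPy sketch (an illustration only, not TensorFlow code) compares three ways of applying a per-pixel function: a recursive descent that mimics the nested mapping from the question, a single pass over the flattened [-1, C] rows, and a fully vectorized version along the channel axis. `my_custom_fun` here is a hypothetical stand-in, since the real function is not shown in the question:

```python
import numpy as np

def my_custom_fun(v):
    # Hypothetical stand-in for the per-pixel function:
    # centre and scale the channel vector (shape is unchanged).
    return (v - v.mean()) / (v.std() + 1e-6)

def nested_apply(x, fn):
    """Mimics the recursive approach: descend one axis at a time down to a [C] vector."""
    if x.ndim > 1:
        return np.stack([nested_apply(sub, fn) for sub in x])
    return fn(x)

def flat_apply(x, fn):
    """Mimics the reshape approach: a single pass over rows of shape [C]."""
    c = x.shape[-1]
    flat = x.reshape(-1, c)
    return np.stack([fn(v) for v in flat]).reshape(x.shape)

def vectorized(x):
    """The same function written with whole-array operations along the channel axis."""
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / (std + 1e-6)

x = np.random.default_rng(1).normal(size=(2, 3, 4, 8))
a = nested_apply(x, my_custom_fun)
b = flat_apply(x, my_custom_fun)
c = vectorized(x)
assert np.allclose(a, b) and np.allclose(b, c)
```

All three agree because a row-major reshape keeps each pixel's channel vector contiguous. Whether a custom function can be written in the vectorized form depends on the function itself; when it can, operations along the last axis avoid an explicit per-pixel mapping.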


The full code I tested is below.

import numpy as np
import tensorflow as tf
from tensorflow.keras.losses import categorical_crossentropy

def apply_function_on_pixel_features(incoming):

    # get the static input shape: [None, H, W, C]
    _, H, W, C = incoming.get_shape().as_list()
    # flatten to one row per pixel, matching the snippet above
    incoming_flat = tf.reshape(incoming, shape=[-1, C])

    # apply function on every vector of shape [C]
    out_matrix = tf.map_fn(lambda x: x+1, incoming_flat)  # dimension remains unchanged

    # go back to the input shape [None, H, W, C]
    out_matrix = tf.reshape(out_matrix, shape=[-1, H, W, C])

    return out_matrix

H = 32
W = 32
x1 = tf.placeholder(tf.float32, [None, H, W, 1])
labels = tf.placeholder(tf.float32, [None, 10])
x2 = tf.layers.conv2d(x1, filters=1, kernel_size=3, padding='same')

# now apply a function to the features vector associated to each pixel
x3 = apply_function_on_pixel_features(x2)  
x4 = tf.layers.flatten(x3)
x4 = tf.layers.dense(x4, units=10, activation='softmax')

loss = categorical_crossentropy(labels, x4)
optimizer = tf.train.AdamOptimizer(0.001)
train_op = optimizer.minimize(loss)


x = np.zeros(shape=(10, H, W, 1))
y = np.random.choice([0,1], size=(10, 10))


with tf.Session() as sess:
  tf.global_variables_initializer().run()
  sess.run(train_op, feed_dict={x1: x, labels:y})

