Can't use both bias and batch normalization in convolutional layers

Mat*_*ský 16 python tensorflow

I use the slim framework for TensorFlow because of its simplicity. But I want a convolutional layer with both a bias and batch normalization. In vanilla TensorFlow, I have:

def conv2d(input_, output_dim, k_h=5, k_w=5, d_h=2, d_w=2, name="conv2d"):
    with tf.variable_scope(name):
        w = tf.get_variable('w', [k_h, k_w, input_.get_shape()[-1], output_dim],
                            initializer=tf.contrib.layers.xavier_initializer(uniform=False))
        conv = tf.nn.conv2d(input_, w, strides=[1, d_h, d_w, 1], padding='SAME')

        biases = tf.get_variable('biases', [output_dim], initializer=tf.constant_initializer(0.0))
        conv = tf.reshape(tf.nn.bias_add(conv, biases), conv.get_shape())

        tf.summary.histogram("weights", w)
        tf.summary.histogram("biases", biases)

        return conv

d_bn1 = BatchNorm(name='d_bn1')
h1 = lrelu(d_bn1(conv2d(h0, df_dim + y_dim, name='d_h1_conv')))

Then I rewrote it using slim:

h1 = slim.conv2d(h0,
                 num_outputs=self.df_dim + self.y_dim,
                 scope='d_h1_conv',
                 kernel_size=[5, 5],
                 stride=[2, 2],
                 activation_fn=lrelu,
                 normalizer_fn=layers.batch_norm,
                 normalizer_params=batch_norm_params,                           
                 weights_initializer=layers.xavier_initializer(uniform=False),
                 biases_initializer=tf.constant_initializer(0.0)
                 )

But this code does not add a bias to the conv layer. That is because of https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/layers/python/layers/layers.py#L1025, which contains

    layer = layer_class(filters=num_outputs,
                        kernel_size=kernel_size,
                        strides=stride,
                        padding=padding,
                        data_format=df,
                        dilation_rate=rate,
                        activation=None,
                        use_bias=not normalizer_fn and biases_initializer,
                        kernel_initializer=weights_initializer,
                        bias_initializer=biases_initializer,
                        kernel_regularizer=weights_regularizer,
                        bias_regularizer=biases_regularizer,
                        activity_regularizer=None,
                        trainable=trainable,
                        name=sc.name,
                        dtype=inputs.dtype.base_dtype,
                        _scope=sc,
                        _reuse=reuse)
    outputs = layer.apply(inputs)

So when the layer is constructed, no bias is created if batch normalization is used. Does that mean that I can't have both a bias and batch normalization with the slim and layers libraries? Or is there another way to get both a bias and batch normalization in a layer when using slim?

Pat*_*wie 23

Batch normalization already includes the addition of a bias term. Recall that BatchNorm is already:

gamma * normalized(x) + bias    # normalized(x) = (x - mean(x)) / sqrt(var(x) + eps)

So there is no need (and no point) to add another bias term in the convolutional layer. Put simply, BatchNorm shifts the activations by their mean value, so any constant offset added before it is cancelled out.
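To see the cancellation concretely, here is a minimal NumPy sketch (illustrative only, not from the original post) comparing per-batch normalization with and without a constant bias:

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)      # pre-bias activations of one channel
b = 3.7                        # an arbitrary constant bias

def batch_norm(v, eps=1e-5):
    # per-batch normalization, before gamma/beta are applied
    return (v - v.mean()) / np.sqrt(v.var() + eps)

# adding a constant before BatchNorm changes nothing:
print(np.allclose(batch_norm(x), batch_norm(x + b)))   # True

The mean subtraction absorbs the constant, so the two paths are numerically identical.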

If you still want to do this, you need to drop the normalizer_fn argument and add BatchNorm as its own layer. Like I said, this makes no sense.

But a solution would be something like

net = slim.conv2d(net, normalizer_fn=None, ...)
net = slim.batch_norm(net)

Note that BatchNorm relies on updates that are not part of the gradient computation: the moving mean and variance are refreshed through ops in the UPDATE_OPS collection. So you either need an optimizer setup that runs the UPDATE_OPS collection, or you need to add tf.control_dependencies manually.
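In graph-mode TensorFlow 1.x the manual version looks like the sketch below; `loss` and the choice of AdamOptimizer are placeholders for whatever your model actually uses:

import tensorflow as tf

# `loss` is assumed to be defined elsewhere in the graph
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)  # moving mean/variance updates
with tf.control_dependencies(update_ops):
    # the moving statistics are now refreshed on every training step
    train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)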

In short: even if you implemented ConvWithBias + BatchNorm, it would behave exactly like ConvWithoutBias + BatchNorm. It is the same as how multiple fully connected layers stacked without activation functions behave like a single one.
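That analogy can be checked numerically; a small NumPy sketch (arbitrary shapes and values, chosen just for illustration) showing two stacked linear layers collapsing into one:

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=4)
W1, b1 = rng.normal(size=(4, 4)), rng.normal(size=4)
W2, b2 = rng.normal(size=(4, 4)), rng.normal(size=4)

# two stacked linear layers (no activation in between) ...
two_layers = W2 @ (W1 @ x + b1) + b2
# ... equal a single linear layer with merged parameters:
one_layer = (W2 @ W1) @ x + (W2 @ b1 + b2)
print(np.allclose(two_layers, one_layer))   # True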