How can I use batch normalization in TensorFlow?

Sha*_*Lee 77 python tensorflow

I want to use batch normalization in TensorFlow. I found the related C++ source code in core/ops/nn_ops.cc. However, I did not find it documented on tensorflow.org.

BN has different semantics in MLPs and CNNs, so I'm not sure exactly what this BN op does.

Nor did I find a method called MovingMoments.

dga*_*dga 56

Update July 2016: The easiest way to use batch normalization in TensorFlow is through the higher-level interfaces provided in contrib/layers, tflearn, or slim.
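For instance, with contrib/layers it reduces to a couple of lines. A minimal sketch (layer sizes and placeholder names are mine, and argument defaults may differ between versions):

import tensorflow as tf

is_training = tf.placeholder(tf.bool, name='is_training')

x = tf.placeholder(tf.float32, [None, 784])
h = tf.contrib.layers.fully_connected(x, 100, activation_fn=None)
# batch_norm handles the moving-average bookkeeping internally;
# updates_collections=None makes the updates run in place.
h = tf.contrib.layers.batch_norm(h, is_training=is_training,
                                 center=True, scale=True,
                                 updates_collections=None)
h = tf.nn.relu(h)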

Previous answer, if you want to DIY: The docstring for this has improved since the release - see the docs comment in the master branch instead of the one you found. It clarifies, in particular, that it's the output from tf.nn.moments.

You can see a very simple example of its use in the batch_norm test code. For a more real-world use example, I've included below the helper class and usage notes that I wrote up for my own use (no warranty provided!):

"""A helper class for managing batch normalization state.                   

This class is designed to simplify adding batch normalization               
(http://arxiv.org/pdf/1502.03167v3.pdf) to your model by                    
managing the state variables associated with it.                            

Important use note:  The function get_assigner() returns                    
an op that must be executed to save the updated state.                      
A suggested way to do this is to make execution of the                      
model optimizer force it, e.g., by:                                         

  update_assignments = tf.group(bn1.get_assigner(),                         
                                bn2.get_assigner())                         
  with tf.control_dependencies([optimizer]):                                
    optimizer = tf.group(update_assignments)                                

"""

import tensorflow as tf


class ConvolutionalBatchNormalizer(object):
  """Helper class that groups the normalization logic and variables.        

  Use:                                                                      
      ewma = tf.train.ExponentialMovingAverage(decay=0.99)                  
      bn = ConvolutionalBatchNormalizer(depth, 0.001, ewma, True)           
      update_assignments = bn.get_assigner()                                
      x = bn.normalize(y, train=training?)                                  
      (the output x will be batch-normalized).                              
  """

  def __init__(self, depth, epsilon, ewma_trainer, scale_after_norm):
    self.mean = tf.Variable(tf.constant(0.0, shape=[depth]),
                            trainable=False)
    self.variance = tf.Variable(tf.constant(1.0, shape=[depth]),
                                trainable=False)
    self.beta = tf.Variable(tf.constant(0.0, shape=[depth]))
    self.gamma = tf.Variable(tf.constant(1.0, shape=[depth]))
    self.ewma_trainer = ewma_trainer
    self.epsilon = epsilon
    self.scale_after_norm = scale_after_norm

  def get_assigner(self):
    """Returns an EWMA apply op that must be invoked after optimization."""
    return self.ewma_trainer.apply([self.mean, self.variance])

  def normalize(self, x, train=True):
    """Returns a batch-normalized version of x."""
    if train:
      mean, variance = tf.nn.moments(x, [0, 1, 2])
      assign_mean = self.mean.assign(mean)
      assign_variance = self.variance.assign(variance)
      with tf.control_dependencies([assign_mean, assign_variance]):
        return tf.nn.batch_norm_with_global_normalization(
            x, mean, variance, self.beta, self.gamma,
            self.epsilon, self.scale_after_norm)
    else:
      mean = self.ewma_trainer.average(self.mean)
      variance = self.ewma_trainer.average(self.variance)
      local_beta = tf.identity(self.beta)
      local_gamma = tf.identity(self.gamma)
      return tf.nn.batch_norm_with_global_normalization(
          x, mean, variance, local_beta, local_gamma,
          self.epsilon, self.scale_after_norm)
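To make the docstring's sketch concrete, here is roughly how the class above could be wired into a model (my own wiring, untested; conv_out, target, and the squared-error loss are stand-ins for your actual model):

import tensorflow as tf

depth = 64
conv_out = tf.placeholder(tf.float32, [None, 28, 28, depth])  # stand-in for a conv layer's output
target = tf.placeholder(tf.float32, [None, 28, 28, depth])

ewma = tf.train.ExponentialMovingAverage(decay=0.99)
bn = ConvolutionalBatchNormalizer(depth, 0.001, ewma, True)
update_assignments = bn.get_assigner()

normed = bn.normalize(conv_out, train=True)        # use train=False at inference time
loss = tf.reduce_mean(tf.square(normed - target))  # dummy loss, for illustration only

# Force each optimizer step to also apply the EWMA update, per the docstring:
opt_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
with tf.control_dependencies([opt_op]):
    train_op = tf.group(update_assignments)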

Note that I called it a ConvolutionalBatchNormalizer because it pins the use of tf.nn.moments to aggregate across axes 0, 1, and 2, whereas for non-convolutional use you might only want axis 0 - see the small illustration below.
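Concretely, the difference is just the axes argument to tf.nn.moments (a small illustration of that point, untested):

import tensorflow as tf

# Convolutional activations: per-channel statistics over batch + spatial axes.
conv = tf.placeholder(tf.float32, [None, 28, 28, 64])
conv_mean, conv_var = tf.nn.moments(conv, [0, 1, 2])  # shapes: [64]

# Fully-connected activations: per-feature statistics over the batch axis only.
fc = tf.placeholder(tf.float32, [None, 128])
fc_mean, fc_var = tf.nn.moments(fc, [0])              # shapes: [128]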

Feedback appreciated if you use it.


Mat*_*htz 52

As of TensorFlow 1.0 (February 2017), there's also the high-level tf.layers.batch_normalization API included in TensorFlow itself.

It's super simple to use:

# Set this to True for training and False for testing
training = tf.placeholder(tf.bool)

x = tf.layers.dense(input_x, units=100)
x = tf.layers.batch_normalization(x, training=training)
x = tf.nn.relu(x)

...except that it adds the extra ops to the graph (for updating its mean and variance variables) in such a way that they won't be dependencies of your training op. You can either just run the ops separately:

extra_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
sess.run([train_op, extra_update_ops], ...)

Or add the update ops as dependencies of your training op manually, then just run your training op as normal:

extra_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(extra_update_ops):
    train_op = optimizer.minimize(loss)
...
sess.run([train_op], ...)

  • @mamafoku The batch norm algorithm needs the mean and standard deviation of the whole training set. These are *computed* during training, but *used* only at inference time. The computation is done with an exponential moving average. It's independent of the rest of training, so you either have to run the exponential-average update step (i.e. `extra_update_ops`) "manually" at each training iteration alongside your regular training op, or make the training op depend on `extra_update_ops` (using a `control_dependencies()` block). Hope this helps. (5 upvotes)

bgs*_*shi 33

The following works fine for me. It doesn't require invoking the EMA-apply op externally.

import numpy as np
import tensorflow as tf

def batch_norm(x, n_out, phase_train, scope='bn'):
    """
    Batch normalization on convolutional maps.
    Args:
        x:           Tensor, 4D BHWD input maps
        n_out:       integer, depth of input maps
        phase_train: boolean tf.Tensor (e.g. a placeholder), true indicates training phase
        scope:       string, variable scope
    Return:
        normed:      batch-normalized maps
    """
    with tf.variable_scope(scope):
        beta = tf.Variable(tf.constant(0.0, shape=[n_out]),
                                     name='beta', trainable=True)
        gamma = tf.Variable(tf.constant(1.0, shape=[n_out]),
                                      name='gamma', trainable=True)
        batch_mean, batch_var = tf.nn.moments(x, [0,1,2], name='moments')
        ema = tf.train.ExponentialMovingAverage(decay=0.5)

        def mean_var_with_update():
            ema_apply_op = ema.apply([batch_mean, batch_var])
            with tf.control_dependencies([ema_apply_op]):
                return tf.identity(batch_mean), tf.identity(batch_var)

        mean, var = tf.cond(phase_train,
                            mean_var_with_update,
                            lambda: (ema.average(batch_mean), ema.average(batch_var)))
        normed = tf.nn.batch_normalization(x, mean, var, beta, gamma, 1e-3)
    return normed

Example:

import math

n_in, n_out = 3, 16
ksize = 3
stride = 1
phase_train = tf.placeholder(tf.bool, name='phase_train')
input_image = tf.placeholder(tf.float32, name='input_image')
kernel = tf.Variable(tf.truncated_normal([ksize, ksize, n_in, n_out],
                                   stddev=math.sqrt(2.0/(ksize*ksize*n_out))),
                                   name='kernel')
conv = tf.nn.conv2d(input_image, kernel, [1,stride,stride,1], padding='SAME')
conv_bn = batch_norm(conv, n_out, phase_train)
relu = tf.nn.relu(conv_bn)

with tf.Session() as session:
    session.run(tf.initialize_all_variables())
    for i in range(20):
        test_image = np.random.rand(4,32,32,3)
        sess_outputs = session.run([relu],
          {input_image.name: test_image, phase_train.name: True})
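At test time you would flip the phase flag and feed False instead, e.g. (my addition, continuing the snippet above):

with tf.Session() as session:
    session.run(tf.initialize_all_variables())
    test_image = np.random.rand(4, 32, 32, 3)
    # Same graph, but phase_train=False makes batch_norm use the EMA statistics.
    sess_outputs = session.run([relu],
        {input_image.name: test_image, phase_train.name: False})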

  • Is your code really necessary, considering that there is an official BN layer now? Code: https://github.com/tensorflow/tensorflow/blob/b826b79718e3e93148c3545e7aa3f90891744cc0/tensorflow/contrib/layers/python/layers/layers.py#L100 (3 upvotes)

Pin*_*hio 15

There is also an "official" batch normalization layer coded up by the developers. They don't have very good docs on how to use it, but here is how to use it (according to me):

import tensorflow as tf
from tensorflow.contrib.layers.python.layers import batch_norm as batch_norm

def batch_norm_layer(x, train_phase, scope_bn):
    bn_train = batch_norm(x, decay=0.999, center=True, scale=True,
                          updates_collections=None,
                          is_training=True,
                          reuse=None,  # is this right?
                          trainable=True,
                          scope=scope_bn)
    bn_inference = batch_norm(x, decay=0.999, center=True, scale=True,
                              updates_collections=None,
                              is_training=False,
                              reuse=True,  # is this right?
                              trainable=True,
                              scope=scope_bn)
    z = tf.cond(train_phase, lambda: bn_train, lambda: bn_inference)
    return z

To actually use it you need to create a placeholder train_phase that indicates whether you are in the training or inference phase (as in train_phase = tf.placeholder(tf.bool, name='phase_train')). Its value can be filled during inference or training with a tf.Session as in:

test_error = sess.run(fetches=cross_entropy, feed_dict={x: batch_xtest, y_:batch_ytest, train_phase: False})

or during training:

sess.run(fetches=train_step, feed_dict={x: batch_xs, y_:batch_ys, train_phase: True})

I'm pretty sure this is correct according to the discussion on GitHub.


There also seems to be another useful link:

http://r2rt.com/implementing-batch-normalization-in-tensorflow.html


Mar*_*rek 12

You can simply use the built-in batch_norm layer:

batch_norm = tf.cond(is_train, 
    lambda: tf.contrib.layers.batch_norm(prev, activation_fn=tf.nn.relu, is_training=True, reuse=None),
    lambda: tf.contrib.layers.batch_norm(prev, activation_fn=tf.nn.relu, is_training=False, reuse=True))

where prev is the output of your previous layer (either fully-connected or convolutional) and is_train is a boolean placeholder. Then just use batch_norm as the input to the next layer; a wired-up sketch follows below.
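Put together, a minimal sketch might look as follows (names are mine; note that I also pass a shared scope so that reuse=True in the inference branch can find the variables created by the training branch):

import tensorflow as tf

is_train = tf.placeholder(tf.bool, name='is_train')
inputs = tf.placeholder(tf.float32, [None, 784])

prev = tf.contrib.layers.fully_connected(inputs, 256, activation_fn=None)
batch_norm = tf.cond(is_train,
    lambda: tf.contrib.layers.batch_norm(prev, activation_fn=tf.nn.relu,
                                         is_training=True, reuse=None, scope='bn'),
    lambda: tf.contrib.layers.batch_norm(prev, activation_fn=tf.nn.relu,
                                         is_training=False, reuse=True, scope='bn'))
logits = tf.contrib.layers.fully_connected(batch_norm, 10, activation_fn=None)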


jro*_*ock 11

Since someone recently edited this, I'd like to clarify that this is no longer an issue.

The answer above doesn't seem correct: when phase_train is set to False, it still updates the EMA mean and variance. This can be verified with the following code snippet.

import tensorflow as tf

sess = tf.InteractiveSession()  # the bare .eval()/.run() calls below need a default session

x = tf.placeholder(tf.float32, [None, 20, 20, 10], name='input')
phase_train = tf.placeholder(tf.bool, name='phase_train')

# generate random noise to pass into batch norm
x_gen = tf.random_normal([50,20,20,10])
pt_false = tf.Variable(tf.constant(True))  # unused in this snippet

#generate a constant variable to pass into batch norm
y = x_gen.eval()

bn = batch_norm(x, 10, phase_train)  # batch_norm as defined in the answer above

tf.initialize_all_variables().run()
train_step = lambda: bn.eval({x:x_gen.eval(), phase_train:True})
test_step = lambda: bn.eval({x:y, phase_train:False})
test_step_c = lambda: bn.eval({x:y, phase_train:True})

# Verify that this is different as expected, two different x's have different norms
print(train_step()[0][0][0])
print(train_step()[0][0][0])

# Verify that this is same as expected, same x's (y) have same norm
print(test_step_c()[0][0][0])
print(test_step_c()[0][0][0])

# THIS IS DIFFERENT but should be they same, should only be reading from the ema.
print(test_step()[0][0][0])
print(test_step()[0][0][0])

  • Thanks for the update. I still can't comment on your thread (not enough rep), but it looks like it should work now. Thanks @myme5261314. (2 upvotes)