What is the meaning of the `trainable` and `training` flags in tf.layers.batch_normalization? How do the two differ during training and prediction?
Batch norm has two phases:
1. Training:
   - Normalize layer activations using the mean and variance of the current batch, then scale/shift with `beta` and `gamma`.
     (`training` should be `True`.)
   - Update the `moving_avg` and `moving_var` statistics.
     (`trainable` should be `True`.)
2. Inference:
   - Normalize layer activations using the accumulated `moving_avg` and `moving_var`, then scale/shift with `beta` and `gamma`.
     (`training` should be `False`.)
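As a mental model, here is a minimal NumPy sketch of both phases for a simple (batch, features) input; `eps` and `momentum` mirror the tf.layers.batch_normalization defaults (0.001 and 0.99), and the function name is just illustrative:

import numpy as np

def batch_norm(x, beta, gamma, moving_avg, moving_var, training,
               eps=1e-3, momentum=0.99):
    if training:
        mean, var = x.mean(axis=0), x.var(axis=0)  # current batch statistics
        # the running statistics are updated as a side effect of training
        moving_avg = momentum * moving_avg + (1 - momentum) * mean
        moving_var = momentum * moving_var + (1 - momentum) * var
    else:
        mean, var = moving_avg, moving_var         # accumulated statistics
    x_hat = (x - mean) / np.sqrt(var + eps)        # normalize
    return gamma * x_hat + beta, moving_avg, moving_var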
Example code to illustrate the different cases:
import numpy as np
import tensorflow as tf   # TF 1.x API

# random image
img = np.random.randint(0, 10, (2, 2, 4)).astype(np.float32)

# batch norm params initialized
beta = np.ones((4)).astype(np.float32) * 1          # all ones
gamma = np.ones((4)).astype(np.float32) * 2         # all twos
moving_mean = np.zeros((4)).astype(np.float32)      # all zeros
moving_var = np.ones((4)).astype(np.float32)        # all ones

# placeholder for the input image
_input = tf.placeholder(tf.float32, shape=(1, 2, 2, 4), name='input')

# batch norm layer
out = tf.layers.batch_normalization(
    _input,
    beta_initializer=tf.constant_initializer(beta),
    gamma_initializer=tf.constant_initializer(gamma),
    moving_mean_initializer=tf.constant_initializer(moving_mean),
    moving_variance_initializer=tf.constant_initializer(moving_var),
    training=False, trainable=False)

update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
init_op = tf.global_variables_initializer()
## 2. Run the graph in a session
with tf.Session() as sess:
    # init the variables
    sess.run(init_op)
    for i in range(2):
        ops, o = sess.run([update_ops, out],
                          feed_dict={_input: np.expand_dims(img, 0)})
        print('beta', sess.run('batch_normalization/beta:0'))
        print('gamma', sess.run('batch_normalization/gamma:0'))
        print('moving_avg', sess.run('batch_normalization/moving_mean:0'))
        print('moving_variance', sess.run('batch_normalization/moving_variance:0'))
        print('out', np.round(o))
        print('')
When training=False and trainable=False:
img = [[[4., 5., 9., 0.]...
out = [[ 9. 11. 19.  1.]...
With moving_mean = 0 and moving_var = 1 the normalization step is the identity, so the activation is simply scaled/shifted using gamma and beta: out = 2 * img + 1.
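To see why, a quick hand check of the first row (a verification, not part of the original answer):

import numpy as np

x = np.array([4., 5., 9., 0.])     # first row of img
print(np.round(2. * x + 1.))       # [ 9. 11. 19.  1.] -- matches out above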
When training=True and trainable=False:
out = [[ 2.  2.  3. -1.] ...
The activation is normalized using the mean and variance of the current batch, then scaled/shifted with `gamma` and `beta`.
The moving averages are not updated.
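What training=True computes, sketched in NumPy for an NHWC input (the helper name is illustrative; `eps` mirrors the layer's default of 0.001):

import numpy as np

def bn_training_mode(x, beta, gamma, eps=1e-3):
    # per-channel statistics over the N, H, W axes of an NHWC input
    mean = x.mean(axis=(0, 1, 2))
    var = x.var(axis=(0, 1, 2))
    return gamma * (x - mean) / np.sqrt(var + eps) + beta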
When training=True and trainable=True:
The output is the same as above, but now `moving_avg` and `moving_var` get updated to new values:
moving_avg [0.03249997 0.03499997 0.06499994 0.02749997]
moving_variance [1.0791667 1.1266665 1.0999999 1.0925]
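The new values follow from the exponential moving average that the update ops apply (momentum defaults to 0.99 in tf.layers.batch_normalization); a sketch:

def ema_update(moving, batch_stat, momentum=0.99):
    # applied once per training step
    return momentum * moving + (1 - momentum) * batch_stat

# Two iterations starting from moving_avg = 0, as in the loop above:
#   step 1: 0.99 * 0 + 0.01 * batch_mean                   = 0.0100 * batch_mean
#   step 2: 0.99 * 0.0100 * batch_mean + 0.01 * batch_mean ≈ 0.0199 * batch_mean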
This is quite complicated. The behavior changed in TF 2.0; see the documentation's note about setting `layer.trainable = False` on a `BatchNormalization` layer:

Setting `layer.trainable = False` means freezing the layer, i.e. its internal state will not change during training: its trainable weights will not be updated during `fit()` or `train_on_batch()`, and its state updates will not be run.

Usually, this does not necessarily mean that the layer runs in inference mode (which is normally controlled by the `training` argument that can be passed when calling a layer). "Frozen state" and "inference mode" are two separate concepts.

However, in the case of the `BatchNormalization` layer, setting `trainable = False` on the layer means that the layer will subsequently run in inference mode (meaning that it will use the moving mean and the moving variance to normalize the current batch, rather than using the mean and variance of the current batch). This behavior was introduced in TensorFlow 2.0 so that `layer.trainable = False` produces the most commonly expected behavior in the convnet fine-tuning use case. Note:

- This behavior only occurs as of TensorFlow 2.0. In 1.*, setting `layer.trainable = False` would freeze the layer but would not switch it to inference mode.
- Setting `trainable` on a model containing other layers recursively sets the `trainable` value of all inner layers.
- If the value of the `trainable` attribute is changed after calling `compile()` on a model, the new value does not take effect for that model until `compile()` is called again.
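A minimal TF 2.x sketch of the fine-tuning behavior described above (illustrative, not from the original answer):

import tensorflow as tf

bn = tf.keras.layers.BatchNormalization()
x = tf.random.normal((8, 4))

y = bn(x, training=True)   # normalizes with batch stats, updates moving stats
bn.trainable = False       # freeze the layer
y = bn(x, training=True)   # TF 2.x now runs it in inference mode anyway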