Is the batch norm momentum convention (default = 0.1) correct? In other libraries, e.g. TensorFlow, it usually seems to be 0.9 or 0.99 by default. Or are we just using a different convention?
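For reference, a minimal numpy sketch (variable names mine) of the two conventions: PyTorch's momentum and TensorFlow's decay describe the same exponential moving average, just parameterized from opposite ends.

import numpy as np

running, batch_stat = 0.0, 1.0  # toy running statistic and new batch statistic

# PyTorch convention: momentum weights the *new* batch statistic.
momentum = 0.1
running_pytorch = (1 - momentum) * running + momentum * batch_stat

# TensorFlow convention: decay weights the *old* running statistic.
decay = 0.9
running_tensorflow = decay * running + (1 - decay) * batch_stat

assert np.isclose(running_pytorch, running_tensorflow)  # momentum == 1 - decay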
python neural-network deep-learning pytorch batch-normalization
I have the following architecture:
Conv1
Relu1
Pooling1
Conv2
Relu2
Pooling2
FullyConnect1
FullyConnect2
My question is: where do I apply batch normalization, and what is the best function for doing this in TensorFlow?
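For illustration, one common placement (the Conv -> BN -> ReLU ordering from the original paper, not the only valid one) with the TF 1.x layers API; `is_training` is an assumed boolean placeholder:

import tensorflow as tf  # TF 1.x API

def block(x, filters, is_training):
    # Conv -> BN -> ReLU -> Pool; BN before the nonlinearity, as in the paper.
    # use_bias=False because BN's beta makes the conv bias redundant.
    x = tf.layers.conv2d(x, filters, 3, padding='same', use_bias=False)
    x = tf.layers.batch_normalization(x, training=is_training)
    x = tf.nn.relu(x)
    return tf.layers.max_pooling2d(x, pool_size=2, strides=2)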
python machine-learning conv-neural-network tensorflow batch-normalization
torch.nn has the classes BatchNorm1d, BatchNorm2d, and BatchNorm3d, but no BatchNorm class for fully connected layers? What is the standard way of applying a plain batch norm in PyTorch?
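For what it's worth, a minimal sketch showing that BatchNorm1d is the class that accepts the (N, C) output of a fully connected layer:

import torch
import torch.nn as nn

# BatchNorm1d handles (N, C) input, i.e. the output of a Linear layer.
model = nn.Sequential(
    nn.Linear(128, 64),
    nn.BatchNorm1d(64),  # one mean/var pair per feature, over the batch
    nn.ReLU(),
)
out = model(torch.randn(32, 128))  # batch of 32 samples, 128 features each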
python neural-network deep-learning pytorch batch-normalization
What is the significance of the "trainable" and "training" flags in tf.layers.batch_normalization? How do the two differ during training and prediction?
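As a sketch of how the two flags are usually wired in TF 1.x (placeholder names are mine): `training` selects batch statistics vs. stored moving averages at run time, while `trainable` decides at graph-construction time whether gamma and beta are registered as trainable variables.

import tensorflow as tf  # TF 1.x API

x = tf.placeholder(tf.float32, [None, 16])
is_training = tf.placeholder(tf.bool)  # fed per session.run call

# training=... switches between batch statistics (True) and the stored
# moving averages (False) each time the graph is run.
# trainable=False would remove gamma/beta from TRAINABLE_VARIABLES,
# freezing them for the optimizer; it does not affect the moving averages.
y = tf.layers.batch_normalization(x, training=is_training, trainable=True)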
The model.eval() method modifies certain modules (layers) that need to behave differently during training and inference. Some examples are listed in the docs:
This has an effect only on certain modules. See the documentation of particular modules for details of their behavior in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.
Is there an exhaustive list of the affected modules?
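For illustration, the toggle itself; a minimal sketch of the two modes on a model containing two of the affected layer types:

import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 8), nn.BatchNorm1d(8), nn.Dropout(0.5))

model.eval()   # BatchNorm switches to running stats; Dropout becomes a no-op
model.train()  # back to batch statistics and active dropout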
I would like to know the difference between batch normalization and self-normalizing neural networks. In other words, does SELU (scaled exponential linear unit) replace batch normalization?
Moreover, after looking at the SELU activation values, I found they were in the range [-1, 1], whereas that is not the case with batch normalization: the values after the BN layer (before the relu activation) took values of roughly [-a, a] rather than [-1, 1].
Here is how I printed the values after the SELU activation and after the batch norm layer:
batch_norm_layer = tf.Print(batch_norm_layer,
                            data=[tf.reduce_max(batch_norm_layer),
                                  tf.reduce_min(batch_norm_layer)],
                            message=name_scope + ' min and max')
And similar code for the SELU activations…
The batch norm layer is defined as follows:
def batch_norm(x, n_out, phase_train, in_conv_layer=True):
    with tf.variable_scope('bn'):
        beta = tf.Variable(tf.constant(0.0, shape=[n_out]),
                           name='beta', trainable=True)
        gamma = tf.Variable(tf.constant(1.0, shape=[n_out]),
                            name='gamma', trainable=True)
        if in_conv_layer:
            batch_mean, batch_var = tf.nn.moments(x, [0, 1, 2], name='moments')
        else:
            batch_mean, batch_var = tf.nn.moments(x, [0, 1], name='moments')
        ema = tf.train.ExponentialMovingAverage(decay=0.9999) …
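For reference (not part of the original post), SELU with the constants published by Klambauer et al. (2017); a small numpy sketch showing that its negative outputs saturate near -λα ≈ -1.758 while positive outputs are unbounded:

import numpy as np

# SELU as defined in "Self-Normalizing Neural Networks" (Klambauer et al., 2017).
LAMBDA = 1.0507009873554805
ALPHA = 1.6732632423543772

def selu(x):
    return LAMBDA * np.where(x > 0, x, ALPHA * (np.exp(x) - 1.0))

# Negative outputs saturate at -LAMBDA * ALPHA (about -1.758);
# positive outputs are simply LAMBDA * x, so they are unbounded.
print(selu(np.array([-10.0, 0.0, 3.0])))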
For batch normalization at test time, how are the mean and variance of each activation input computed (at each layer and for each input dimension)? Does one record the means and variances seen during training, compute the mean and variance over the whole training set, or compute them over the whole test set?

Many people say the mean and variance should be pre-computed, but if the approach is to compute them over the whole test set, wouldn't the whole test set's mean and variance have to be computed during the forward pass (so not "pre" at all)?
Thanks very much for your help!
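In the original paper the inference statistics are population statistics estimated from the training data; most implementations approximate this with moving averages recorded during training and reuse them, frozen, at test time. A minimal numpy sketch (names mine):

import numpy as np

momentum = 0.1
running_mean, running_var = 0.0, 1.0

# Training: fold each mini-batch's statistics into the moving averages.
for batch in np.split(np.random.randn(1000), 10):
    running_mean = (1 - momentum) * running_mean + momentum * batch.mean()
    running_var = (1 - momentum) * running_var + momentum * batch.var()

# Test time: the stored statistics are frozen, so nothing has to be
# computed from the test set and no full pass over it is required.
def bn_inference(x, gamma=1.0, beta=0.0, eps=1e-5):
    return gamma * (x - running_mean) / np.sqrt(running_var + eps) + beta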
machine-learning normalization neural-network deep-learning batch-normalization
I am quite confused by TensorFlow's tf.layers.batch_normalization.
My code is as follows:
def my_net(x, num_classes, phase_train, scope):
    x = tf.layers.conv2d(...)
    x = tf.layers.batch_normalization(x, training=phase_train)
    x = tf.nn.relu(x)
    x = tf.layers.max_pooling2d(...)
    # some other stuff
    ...
    # return
    return x
def train():
    phase_train = tf.placeholder(tf.bool, name='phase_train')
    image_node = tf.placeholder(tf.float32, shape=[batch_size, HEIGHT, WIDTH, 3])
    images, labels = data_loader(train_set)
    val_images, val_labels = data_loader(validation_set)
    prediction_op = my_net(image_node, num_classes=2, phase_train=phase_train, scope='Branch1')
    loss_op = loss(...)
    # some other stuff
    optimizer = tf.train.AdamOptimizer(base_learning_rate)
    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    with tf.control_dependencies(update_ops):
        train_op = optimizer.minimize(loss=total_loss, global_step=global_step)
    sess = ...
    coord …
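For reference, the usual pattern with a graph like this (a sketch reusing the placeholder and op names from the snippet above; train_batch and val_batch are hypothetical numpy batches) is to flip phase_train in the feed dict, so training steps use batch statistics while validation uses the moving averages:

# Sketch: toggling the phase_train flag between training and validation.
_, loss_value = sess.run([train_op, loss_op],
                         feed_dict={image_node: train_batch, phase_train: True})

val_pred = sess.run(prediction_op,
                    feed_dict={image_node: val_batch, phase_train: False})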
I am trying to use batch normalization layers with a U-net for a segmentation task. The same layers work fine for ResNet, VGG, Xception, etc., so I am curious whether this is an architecture-dependent problem. During training everything is fine, the metrics improve and the loss drops, but as soon as I try to evaluate the model or predict masks, it produces garbage. The learned weights of those layers seem to keep updating even during testing and prediction. How can this be fixed in Keras? Keras version = 2.2.2.

I tried using batch norm layers only in the encoder part; it did not help. I also tried setting the layers' parameter trainable=False; it did not help.
import keras
from keras.models import Input, Model
from keras.layers import Conv2D, Concatenate, MaxPooling2D
from keras.layers import UpSampling2D, Dropout, BatchNormalization

def conv_block(m, dim, res, do=0):
    n = Conv2D(dim, 3, padding='same')(m)
    n = BatchNormalization()(n)
    n = keras.layers.LeakyReLU(0)(n)
    n = Dropout(do)(n) if do else n
    n = Conv2D(dim, 3, padding='same')(n)
    n = BatchNormalization()(n)
    n = keras.layers.LeakyReLU(0)(n)
    return Concatenate()([m, n]) if res else n

def conv_block_bn(m, dim, res, do=0):
    n = Conv2D(dim, 3, …
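One diagnostic sketch (mine, not from the original post; `model` and `x_batch` are assumed to be the compiled U-net and a sample batch): run the same batch under Keras learning phase 0 (test: BN uses moving averages) and 1 (train: BN uses batch statistics) and compare the outputs.

from keras import backend as K

# Run the same batch in test phase and train phase and compare the outputs.
f = K.function([model.input, K.learning_phase()], [model.output])
out_test = f([x_batch, 0])[0]
out_train = f([x_batch, 1])[0]
print(abs(out_test - out_train).max())  # a large gap points at the BN statistics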
I am implementing a Keras model with a custom batch renormalization layer, which has 4 weights (beta, gamma, running_mean, and running_std) and 3 state variables (r_max, d_max, and t):

self.gamma = self.add_weight(shape=shape,  # NK - shape = shape
                             initializer=self.gamma_init,
                             regularizer=self.gamma_regularizer,
                             name='{}_gamma'.format(self.name))
self.beta = self.add_weight(shape=shape,  # NK - shape = shape
                            initializer=self.beta_init,
                            regularizer=self.beta_regularizer,
                            name='{}_beta'.format(self.name))
self.running_mean = self.add_weight(shape=shape,  # NK - shape = shape
                                    initializer='zero',
                                    name='{}_running_mean'.format(self.name),
                                    trainable=False)
# Note: running_std actually holds the running variance, not the running std.
self.running_std = self.add_weight(shape=shape, initializer='one',
                                   name='{}_running_std'.format(self.name),
                                   trainable=False)
self.r_max = K.variable(np.ones((1,)), name='{}_r_max'.format(self.name))
self.d_max = K.variable(np.zeros((1,)), name='{}_d_max'.format(self.name))
self.t = …

python machine-learning neural-network keras batch-normalization
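For context, a numpy sketch (names mine) of the correction those state variables implement, following the batch renormalization paper (Ioffe, 2017); note that here running_std is treated as a standard deviation, whereas the snippet above stores the running variance under that name:

import numpy as np

def batch_renorm_train(x, running_mean, running_std, r_max, d_max, eps=1e-3):
    # Per-feature statistics of the current mini-batch.
    mu_b = x.mean(axis=0)
    sigma_b = x.std(axis=0)
    # Correction factors; the paper treats r and d as constants in backprop.
    r = np.clip(sigma_b / (running_std + eps), 1.0 / r_max, r_max)
    d = np.clip((mu_b - running_mean) / (running_std + eps), -d_max, d_max)
    # Standard batch norm, then corrected toward the running statistics;
    # with r = 1 and d = 0 this reduces to ordinary batch normalization.
    return (x - mu_b) / (sigma_b + eps) * r + d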