machine-learning neural-network deep-learning caffe batch-normalization
I'm a bit confused about how the "BatchNorm" layer should be used/inserted in a model.
I have seen several different approaches, for example:
"BatchNorm" + "Scale" (no parameter sharing): a "BatchNorm" layer immediately followed by a "Scale" layer:
layer {
  bottom: "res2a_branch1"
  top: "res2a_branch1"
  name: "bn2a_branch1"
  type: "BatchNorm"
  batch_norm_param {
    use_global_stats: true
  }
}
layer {
  bottom: "res2a_branch1"
  top: "res2a_branch1"
  name: "scale2a_branch1"
  type: "Scale"
  scale_param {
    bias_term: true
  }
}
"BatchNorm"在提供caffe的cifar10示例中,"BatchNorm"使用时没有任何"Scale"跟随它:
layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn1"
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
}
"BatchNorm" with a different batch_norm_param for TRAIN and TEST: use_global_stats is switched between the TRAIN and TEST phases:
layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn1"
  batch_norm_param {
    use_global_stats: false
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  include {
    phase: TRAIN
  }
}
layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn1"
  batch_norm_param {
    use_global_stats: true
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  include {
    phase: TEST
  }
}
如何"BatchNorm"在咖啡馆中使用图层?
If you follow the original batch normalization paper, "BatchNorm" should be followed by scale and shift (bias) layers (the bias can be included via "Scale" with bias_term: true, although this makes the bias parameters inaccessible as a separate layer). use_global_stats should also be changed from training (false) to testing/deployment (true), which is the default behavior. Note that the first example you gave is a prototxt for deployment, so it is correct for it to be set to true there.
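Putting that together, a training-time counterpart of your first example could look roughly like the sketch below (an illustration only, reusing the layer and blob names from your first snippet; use_global_stats is simply left out so it falls back to the phase-dependent default):
# During TRAIN, use_global_stats defaults to false (batch statistics are used
# and the running averages are updated); during TEST it defaults to true.
# Some prototxts also add three param { lr_mult: 0 } blocks here, as in your
# other examples.
layer {
  bottom: "res2a_branch1"
  top: "res2a_branch1"
  name: "bn2a_branch1"
  type: "BatchNorm"
}
# Learned scale (gamma) and shift (beta) from the paper, implemented as a
# "Scale" layer with an internal bias term.
layer {
  bottom: "res2a_branch1"
  top: "res2a_branch1"
  name: "scale2a_branch1"
  type: "Scale"
  scale_param {
    bias_term: true
  }
}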
I'm not sure about the shared parameters.
I opened a pull request to improve the documentation on batch normalization, but then closed it because I wanted to revise it, and I never got back to it.
Note that I believe lr_mult: 0 is no longer needed for "BatchNorm" (and is perhaps no longer allowed?), although I can't find the corresponding PR right now.
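If that is indeed the case, the cifar10-style definition would reduce to something like the sketch below (an assumption based on the note above, not a confirmed behavior; verify against the caffe version you use):
layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn1"
  # No param { lr_mult: 0 } blocks: the mean, variance and moving-average
  # factor blobs are updated by the layer itself rather than by the solver.
}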