Backward pass in Caffe Python layer is not called / not working?

Dav*_*utz 3 neural-network deep-learning caffe pycaffe

I have been trying, without success, to implement a simple loss layer in Python using Caffe. As a reference, I found several layers implemented in Python, including here, here, and here.

Starting from the EuclideanLossLayer provided with the Caffe documentation/examples, I could not get it to work and started debugging. Even with this simple TestLayer:

import caffe

class TestLayer(caffe.Layer):
    """
    Identity layer used to check which methods Caffe actually calls.
    """

    def setup(self, bottom, top):
        """
        Checks the correct number of bottom inputs.

        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        :param top: top outputs
        :type top: caffe._caffe.RawBlobVec
        """

        print 'setup'

    def reshape(self, bottom, top):
        """
        Make sure all involved blobs have the right dimension.

        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        :param top: top outputs
        :type top: caffe._caffe.RawBlobVec
        """

        print 'reshape'
        top[0].reshape(*bottom[0].data.shape)

    def forward(self, bottom, top):
        """
        Forward propagation.

        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        :param top: top outputs
        :type top: caffe._caffe.RawBlobVec
        """

        print 'forward'
        top[0].data[...] = bottom[0].data

    def backward(self, top, propagate_down, bottom):
        """
        Backward pass.

        :param top: top diffs
        :type top: caffe._caffe.RawBlobVec
        :param propagate_down: whether to propagate down to each bottom
        :type propagate_down: list of bool
        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        """

        print 'backward'
        bottom[0].diff[...] = top[0].diff[...]
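For context on what the eventual loss layer should compute, the forward and backward passes of a Euclidean loss can be sketched in plain numpy. This is an illustrative stand-in, not the Caffe API; the function names are hypothetical, and it assumes Caffe's convention of loss = 1/(2N)·Σ‖a−b‖² with gradient (a−b)/N with respect to the first bottom:

```python
import numpy

def euclidean_loss_forward(pred, target):
    # Caffe-style Euclidean loss: 1/(2N) * sum over the batch of ||pred - target||^2.
    n = pred.shape[0]
    diff = pred - target
    return 0.5 * numpy.sum(diff ** 2) / n, diff

def euclidean_loss_backward(diff, loss_weight=1.0):
    # Gradient w.r.t. the first bottom; the second bottom gets the negated gradient.
    n = diff.shape[0]
    return loss_weight * diff / n, -loss_weight * diff / n
```

A loss layer's forward would store `diff` in reshape-allocated scratch space and write the scalar loss to `top[0].data`; backward would write the two gradients into the bottoms' `diff` arrays.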

I cannot get the Python layer to work. The learning task is deliberately simple: I merely want to predict whether a real number is positive or negative. The corresponding data is generated as follows and written to an LMDB:

import numpy

N = 10000
N_train = int(0.8*N)

images = []
labels = []

for n in range(N):
    image = (numpy.random.rand(1, 1, 1)*2 - 1).astype(numpy.float)
    label = int(numpy.sign(image))

    images.append(image)
    labels.append(label)
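One subtlety in the generator above: numpy.sign returns 0 for an input of exactly 0, so a stray 0 label could in principle slip into the ±1 targets. A defensive variant (make_sample is a hypothetical helper, not part of the original code):

```python
import numpy

def make_sample(rng):
    # Draw a scalar in [-1, 1) and label it strictly +1 or -1.
    image = (rng.rand(1, 1, 1) * 2 - 1).astype(numpy.float64)
    label = 1 if image.item() >= 0 else -1  # avoids numpy.sign's 0 case
    return image, label
```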

Writing the data to LMDB should be correct, since a test with the MNIST dataset provided by Caffe showed no problems. The network is defined as follows:

 net.data, net.labels = caffe.layers.Data(batch_size = batch_size, backend = caffe.params.Data.LMDB, 
                                                source = lmdb_path, ntop = 2)
 net.fc1 = caffe.layers.Python(net.data, python_param = dict(module = 'tools.layers', layer = 'TestLayer'))
 net.score = caffe.layers.TanH(net.fc1)
 net.loss = caffe.layers.EuclideanLoss(net.score, net.labels)

Solving is done manually with:

for iteration in range(iterations):
    solver.step(step)
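Independent of Caffe, the task itself is easily learnable; the whole pipeline (tanh score on a scalar input, Euclidean loss against ±1 labels, gradient descent) can be sketched in plain numpy as a sanity check on what each solver step should be doing. All names here are illustrative, not Caffe API, and a single weight stands in for the net's parameters:

```python
import numpy

rng = numpy.random.RandomState(0)

# Scalar inputs in [-1, 1) with sign labels, mirroring the LMDB data above.
X = rng.rand(256) * 2 - 1
y = numpy.where(X >= 0, 1.0, -1.0)

def loss(w):
    # Euclidean loss of tanh(w * x) against the +/-1 labels.
    return 0.5 * numpy.mean((numpy.tanh(w * X) - y) ** 2)

w = 0.1   # single weight standing in for the net's parameters
lr = 0.5
loss_before = loss(w)
for _ in range(200):
    s = numpy.tanh(w * X)
    # Chain rule through the loss and the tanh: dL/dw = mean((s - y) * (1 - s^2) * x)
    grad = numpy.mean((s - y) * (1 - s ** 2) * X)
    w -= lr * grad
loss_after = loss(w)
```

The weight grows so that tanh(w·x) saturates toward sign(x) and the loss shrinks; if Caffe's backward pass were running, solver.step would be doing the analogous update on the net's parameters.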

The corresponding prototxt files are:

solver.prototxt:

weight_decay: 0.0005
test_net: "tests/test.prototxt"
snapshot_prefix: "tests/snapshot_"
max_iter: 1000
stepsize: 1000
base_lr: 0.01
snapshot: 0
gamma: 0.01
solver_mode: CPU
train_net: "tests/train.prototxt"
test_iter: 0
test_initialization: false
lr_policy: "step"
momentum: 0.9
display: 100
test_interval: 100000

train.prototxt:

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "labels"
  data_param {
    source: "tests/train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "fc1"
  type: "Python"
  bottom: "data"
  top: "fc1"
  python_param {
    module: "tools.layers"
    layer: "TestLayer"
  }
}
layer {
  name: "score"
  type: "TanH"
  bottom: "fc1"
  top: "score"
}
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "score"
  bottom: "labels"
  top: "loss"
}

test.prototxt:

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "labels"
  data_param {
    source: "tests/test_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "fc1"
  type: "Python"
  bottom: "data"
  top: "fc1"
  python_param {
    module: "tools.layers"
    layer: "TestLayer"
  }
}
layer {
  name: "score"
  type: "TanH"
  bottom: "fc1"
  top: "score"
}
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "score"
  bottom: "labels"
  top: "loss"
}

I tried to track this down by putting debug messages in the backward and forward methods of TestLayer: only the forward method is ever called (note that no testing is performed, so the calls can only come from solving). I likewise added debug messages to python_layer.hpp:

virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  LOG(INFO) << "cpp forward";
  self_.attr("forward")(bottom, top);
}
virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
  LOG(INFO) << "cpp backward";
  self_.attr("backward")(top, propagate_down, bottom);
}

Again, only the forward pass is executed. When I remove the backward method from TestLayer, solving still works. When I remove the forward method, an error is thrown because forward is not implemented. I would expect the same for backward, so it seems the backward pass is simply never executed. Switching back to regular layers and adding debug messages there, everything works as expected.

I feel like I am missing something simple or fundamental, but I have not been able to resolve the problem for several days now. Any help or hints are appreciated.

Thanks!

小智 5

This is the expected behavior, because there is no layer below your Python layer that actually needs gradients to compute weight updates. Caffe notices this and skips the backward computation for such layers, since it would be a waste of time.

At network initialization, Caffe logs for every layer whether it needs backward computation. In your case, you should see something like:

fc1 does not need backward computation.

If you put an "InnerProduct" or "Convolution" layer below your "Python" layer (e.g. Data->InnerProduct->Python->Loss), backward computation becomes necessary and your backward method gets called.
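An alternative for debugging (an addition to the answer, based on Caffe's NetParameter definition): the net prototxt supports a force_backward flag that makes Caffe run backward through every layer, regardless of whether any parameters below need gradients. Placed at the top level of train.prototxt:

```
force_backward: true
```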