Convert a TensorFlow model to a Caffe model

Evi*_*Evi 25 caffe tensorflow

I would like to be able to convert a TensorFlow model to a Caffe model.

I searched on Google, but I was only able to find converters from Caffe to TensorFlow, not the other way around.

Does anyone have an idea how to do this?

Thanks, Evi

小智 6

I ran into the same problem and found a solution. The code can be found here (https://github.com/lFatality/tensorflow2caffe), and I've also documented it in some Youtube videos.


Part 1 covers the creation of the VGG-19 architecture in Caffe and tflearn (a higher-level API for TensorFlow; with some changes to the code, native TensorFlow should also work).


In Part 2 the export of the weights and biases out of the TensorFlow model into a numpy file is described. In tflearn you can get the weights of a layer like this:

#get parameters of a certain layer
conv2d_vars = tflearn.variables.get_layer_variables_by_name(layer_name)
#get weights out of the parameters
weights = model.get_weights(conv2d_vars[0])
#get biases out of the parameters
biases = model.get_weights(conv2d_vars[1])

For a convolutional layer, the layer_name is Conv_2D. Fully-connected layers are called FullyConnected. If you use more than one layer of a certain type, an increasing integer with a preceding underscore is appended (e.g. the 2nd conv layer is called Conv_2D_1). I found these names in the TensorBoard graph. If you name the layers in your architecture definition, these layer_names may change to the names you defined.

In native TensorFlow the export will need different code but the format of the parameters should be the same so subsequent steps should still be applicable.


Part 3 covers the actual conversion. What's critical is the conversion of the weights when you create the caffemodel (the biases can be carried over without change). TensorFlow and Caffe use different formats when saving a filter. While TensorFlow uses [height, width, depth, number of filters] (TensorFlow docs, at the bottom), Caffe uses [number of filters, depth, height, width] (Caffe docs, chapter 'Blob storage and communication'). To convert between the formats you can use the transpose function (for example: weights_of_first_conv_layer.transpose((3,2,0,1))). The 3,2,0,1 sequence is obtained by enumerating the dimensions of the TensorFlow format (the origin) and then reordering them into the Caffe format (the target), keeping each number attached to its original dimension.
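The filter transpose described above can be sketched with numpy (the 3x3x64x128 filter shape is a made-up example for illustration, not from the original answer):

```python
import numpy as np

# Hypothetical TF conv filter: 3x3 kernels, 64 input channels, 128 filters,
# stored in TensorFlow's [height, width, depth, number_of_filters] order.
tf_weights = np.zeros((3, 3, 64, 128))

# Reorder to Caffe's [number_of_filters, depth, height, width] layout.
caffe_weights = tf_weights.transpose((3, 2, 0, 1))
print(caffe_weights.shape)  # (128, 64, 3, 3)
```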
If you want to connect a tensor output to a fully-connected layer, things get a little tricky. If you use VGG-19 with an input size of 112x112 it looks like this.

fc1_weights = data_file[16][0].reshape((4,4,512,4096))
fc1_weights = fc1_weights.transpose((3,2,0,1))
fc1_weights = fc1_weights.reshape((4096,8192))

What you get from TensorFlow if you export the parameters at the connection between tensor and fully-connected layer is an array with the shape [entries in the tensor, units in the fc-layer] (here: [8192, 4096]). You have to find out what the shape of your output tensor is and then reshape the array so that it fits the TensorFlow format (see above, number of filters being the number of units in the fc-layer). After that you use the transpose-conversion you've used previously and then reshape the array again, but the other way around. While TensorFlow saves fc-layer weights as [number of inputs, number of outputs], Caffe does it the other way around.
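The reshape -> transpose -> reshape pipeline can be checked on a tiny toy example (a hypothetical 2x2x3 output tensor feeding a 5-unit fc layer, not the VGG-19 shapes above):

```python
import numpy as np

# Hypothetical TF fc weights at the tensor/fc boundary:
# [entries in the tensor, units in the fc layer] = [2*2*3, 5].
tf_fc = np.arange(2 * 2 * 3 * 5).reshape((12, 5)).astype(np.float64)

w = tf_fc.reshape((2, 2, 3, 5))   # back to [height, width, depth, units]
w = w.transpose((3, 2, 0, 1))     # Caffe filter order [units, depth, h, w]
caffe_fc = w.reshape((5, 12))     # Caffe fc format [outputs, inputs]
print(caffe_fc.shape)  # (5, 12)
```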
If you connect two fc-layers to each other, you don't have to do the complex process previously described but you will have to account for the different fc-layer format by transposing again (fc_layer_weights.transpose((1,0)))
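For the simpler fc-to-fc case, the single transpose looks like this (the 4096x1000 shape is just an illustration):

```python
import numpy as np

# TensorFlow stores fc weights as [inputs, outputs], Caffe as [outputs, inputs].
tf_fc_weights = np.zeros((4096, 1000))
caffe_fc_weights = tf_fc_weights.transpose((1, 0))
print(caffe_fc_weights.shape)  # (1000, 4096)
```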

You can then set the parameters of the network using

net.params['layer_name_in_prototxt'][0].data[...] = weights
net.params['layer_name_in_prototxt'][1].data[...] = biases

This was a quick overview. If you want all the code, it's in my github repository. I hope it helps. :)


Cheers,
Fatality


Jay*_*wal 4

As suggested in @Patwie's comment, you have to do it manually by copying the weights layer by layer. For example, to copy the first convolutional layer's weights from a TensorFlow checkpoint to a caffemodel, you have to do something like the following:

import caffe
import tensorflow as tf

sess = tf.Session()
new_saver = tf.train.import_meta_graph("/path/to/checkpoint.meta")
new_saver.restore(sess, "/path/to/checkpoint")

all_vars = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)

# first conv layer's weights and biases (check the order with print(all_vars))
conv1 = all_vars[0]
bias1 = all_vars[1]

conv_w1, bias_1 = sess.run([conv1, bias1])

net = caffe.Net('path/to/conv.prototxt', caffe.TEST)

net.params['conv_1'][0].data[...] = conv_w1
net.params['conv_1'][1].data[...] = bias_1

...

net.save('modelfromtf.caffemodel')

Note 1: This code has not been tested. I am not sure whether it works, but I believe it should. Also, it only covers a single conv layer. In practice, you first have to analyze the TensorFlow checkpoint to see which layer's weights are at which index (print all_vars), and then copy each layer's weights individually.

Note 2: Some automation is possible by iterating over the initial conv layers, since they usually follow a set pattern (conv1->bn1->relu1->conv2->bn2->relu2...).
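A rough sketch of that automation, assuming weights and biases alternate in all_vars and using made-up layer names and shapes (in the real script, the arrays would come from sess.run and the assignments would go into net.params):

```python
import numpy as np

# Hypothetical exported variables: (weight, bias) pairs in checkpoint order.
all_vars = [np.zeros((3, 3, 3, 64)), np.zeros(64),
            np.zeros((3, 3, 64, 128)), np.zeros(128)]
caffe_layer_names = ["conv_1", "conv_2"]  # names from the .prototxt

params = {}
for name, w, b in zip(caffe_layer_names, all_vars[0::2], all_vars[1::2]):
    # Convert the filter order before assigning to Caffe.
    params[name] = (w.transpose((3, 2, 0, 1)), b)
    # later: net.params[name][0].data[...] = w; net.params[name][1].data[...] = b

print(sorted(params))  # ['conv_1', 'conv_2']
```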

Note 3: TensorFlow may further split each layer's weights into separate indices. For example, weights and biases are separate for a conv layer, as shown above. Also, gamma, mean, and variance are separate for a batch-normalization layer.
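For illustration, a batch-norm layer's separate TensorFlow variables would typically map onto Caffe's BatchNorm plus Scale layers roughly as follows (the layer names and the 64-channel shape are hypothetical; the commented assignments assume Caffe's usual BatchNorm/Scale split):

```python
import numpy as np

# Hypothetical 64-channel batch-norm parameters exported from TensorFlow,
# each stored under its own index in all_vars.
gamma = np.ones(64)
beta = np.zeros(64)
mean = np.zeros(64)
variance = np.ones(64)

# In Caffe these are usually split between a BatchNorm layer
# (mean, variance, scale factor) and a Scale layer (gamma, beta):
# net.params['bn1'][0].data[...] = mean
# net.params['bn1'][1].data[...] = variance
# net.params['bn1'][2].data[...] = 1.0  # scale factor
# net.params['scale1'][0].data[...] = gamma
# net.params['scale1'][1].data[...] = beta
print(gamma.shape, mean.shape)  # (64,) (64,)
```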