Ale*_*dro 80 python tensorflow
I am porting my Caffe network over to TensorFlow, but it doesn't seem to have Xavier initialization. I am using truncated_normal, but this seems to be making training a lot harder.
Sun*_*Kim 114
Since version 0.8 there is a Xavier initializer; see the documentation here.
You can use something like this:
W = tf.get_variable("W", shape=[784, 256],
initializer=tf.contrib.layers.xavier_initializer())
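If it helps to see the variable in context, here is a minimal sketch of using that weight in an actual layer (assuming TF 1.x and the contrib API; the 784→256 dense layer is hypothetical):

import tensorflow as tf

# Hypothetical fully connected layer built around the Xavier-initialized
# weight from the snippet above (TF 1.x / tf.contrib API).
W = tf.get_variable("W", shape=[784, 256],
                    initializer=tf.contrib.layers.xavier_initializer())
b = tf.get_variable("b", shape=[256], initializer=tf.zeros_initializer())
x = tf.placeholder(tf.float32, shape=[None, 784])  # e.g. flattened MNIST images
hidden = tf.nn.relu(tf.matmul(x, W) + b)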
Sau*_*tro 28
Just to add another example on how to define a tf.Variable initialized using Xavier and Yoshua's method:
graph = tf.Graph()
with graph.as_default():
    ...
    initializer = tf.contrib.layers.xavier_initializer()
    w1 = tf.Variable(initializer(w1_shape))
    b1 = tf.Variable(initializer(b1_shape))
    ...
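For a self-contained version of the snippet above, something like the following should work (the shapes are hypothetical, chosen only for illustration; TF 1.x API):

import tensorflow as tf

w1_shape = [784, 256]  # hypothetical weight shape
b1_shape = [256]       # hypothetical bias shape

graph = tf.Graph()
with graph.as_default():
    initializer = tf.contrib.layers.xavier_initializer()
    w1 = tf.Variable(initializer(w1_shape))
    b1 = tf.Variable(initializer(b1_shape))
    init_op = tf.global_variables_initializer()

with tf.Session(graph=graph) as sess:
    sess.run(init_op)
    print(sess.run(w1).shape)  # (784, 256)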
This saved me from getting nan values in my loss function due to numerical instabilities when using multiple layers with ReLUs.
Del*_*lip 13
@Aleph7, Xavier/Glorot initialization depends on the number of incoming connections (fan_in), the number of outgoing connections (fan_out), and the kind of activation function (sigmoid or tanh) of the neuron. See: http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf
So now, to your question. This is how I would do it in TensorFlow (wrapped in a small helper function here so the snippet is runnable):
import numpy as np
import tensorflow as tf

def xavier_weights(shape, fan_in, fan_out):  # hypothetical helper wrapping the original snippet
    low = -4 * np.sqrt(6.0 / (fan_in + fan_out))  # use 4 for sigmoid, 1 for tanh activation
    high = 4 * np.sqrt(6.0 / (fan_in + fan_out))
    return tf.Variable(tf.random_uniform(shape, minval=low, maxval=high,
                                         dtype=tf.float32))
Note that we should be sampling from a uniform distribution, and not the normal distribution as suggested in the other answer.
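That said, if you do want to draw from a normal distribution, the variance-matched choice (usually called Glorot normal) uses stddev = sqrt(2 / (fan_in + fan_out)). A minimal sketch, with hypothetical fan values:

import numpy as np
import tensorflow as tf

fan_in, fan_out = 784, 256  # hypothetical values
# Glorot normal: matches the variance of the uniform version above
stddev = np.sqrt(2.0 / (fan_in + fan_out))
W = tf.Variable(tf.random_normal([fan_in, fan_out], stddev=stddev))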
Incidentally, I wrote a post yesterday about something different using TensorFlow that happens to also use Xavier initialization. If you're interested, there is also a Python notebook with an end-to-end example: https://github.com/delip/blog-stuff/blob/master/tensorflow_ufp.ipynb
A nice wrapper around tensorflow called prettytensor gives an implementation in the source code (copied directly from here):
import math  # imports needed to run the snippet
import tensorflow as tf

def xavier_init(n_inputs, n_outputs, uniform=True):
  """Set the parameter initialization using the method described.

  This method is designed to keep the scale of the gradients roughly the same
  in all layers.

  Xavier Glorot and Yoshua Bengio (2010):
           Understanding the difficulty of training deep feedforward neural
           networks. International conference on artificial intelligence and
           statistics.

  Args:
    n_inputs: The number of input nodes into each output.
    n_outputs: The number of output nodes for each input.
    uniform: If true use a uniform distribution, otherwise use a normal.

  Returns:
    An initializer.
  """
  if uniform:
    # 6 was used in the paper.
    init_range = math.sqrt(6.0 / (n_inputs + n_outputs))
    return tf.random_uniform_initializer(-init_range, init_range)
  else:
    # 3 gives us approximately the same limits as above since this repicks
    # values greater than 2 standard deviations from the mean.
    stddev = math.sqrt(3.0 / (n_inputs + n_outputs))
    return tf.truncated_normal_initializer(stddev=stddev)
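Assuming the xavier_init function above is in scope, it returns an ordinary initializer and can be used like any other; a hypothetical usage sketch (TF 1.x, shapes chosen for illustration):

W_fc1 = tf.get_variable("W_fc1", shape=[784, 256],
                        initializer=xavier_init(784, 256, uniform=True))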
TF-contrib has xavier_initializer. Here is an example of how to use it:
import tensorflow as tf
a = tf.get_variable("a", shape=[4, 4], initializer=tf.contrib.layers.xavier_initializer())
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(a))
Besides this, tensorflow has other initializers:
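For example, here is a non-exhaustive sketch of a few other TF 1.x built-in initializers (shapes and parameters are hypothetical):

import tensorflow as tf

t = tf.get_variable("t", shape=[4, 4],
                    initializer=tf.truncated_normal_initializer(stddev=0.1))
u = tf.get_variable("u", shape=[4, 4],
                    initializer=tf.random_uniform_initializer(-0.05, 0.05))
z = tf.get_variable("z", shape=[4], initializer=tf.zeros_initializer())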
In TensorFlow 2.0 and later, both tf.contrib.* and tf.get_variable() are deprecated. To do Xavier initialization you now have to switch to:
init = tf.initializers.GlorotUniform()
var = tf.Variable(init(shape=shape))

# or as a one-liner, with somewhat confusing nested parentheses
var = tf.Variable(tf.initializers.GlorotUniform()(shape=shape))
Glorot uniform and Xavier uniform are two different names for the same initialization type. If you want to know more about how to use initializations in TF 2.0 with or without Keras, refer to the documentation.
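With Keras in TF 2.x, Glorot/Xavier uniform happens to be the default kernel initializer for Dense layers, and it can also be set explicitly; a minimal sketch:

import tensorflow as tf

# glorot_uniform is already the default kernel_initializer for Dense,
# but it can be spelled out explicitly:
layer = tf.keras.layers.Dense(256, activation="relu",
                              kernel_initializer="glorot_uniform")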