TensorFlow: how to set the learning rate on a log scale, and some TensorFlow questions

che*_*chi 8 python deep-learning tensorflow deep-residual-networks

I am a beginner in deep learning and TensorFlow, and I am trying to implement the algorithm from this paper in TensorFlow. The paper implements it with MatConvNet + MATLAB, and I am curious whether TensorFlow has equivalent functions for the same purpose. The paper says:

The network parameters were initialized using the Xavier method [14]. We used the regression loss over the four wavelet subbands under an l2 penalty, and the proposed network was trained using stochastic gradient descent (SGD). The regularization parameter (λ) was 0.0001 and the momentum was 0.9. The learning rate was set from 10^-1 to 10^-4, decreased on a log scale at each epoch.

The paper uses the wavelet transform (WT) and residual learning (where the residual image = WT(HR) - WT(HR'), and HR' is used for training). The Xavier method suggests initializing the variables from a normal distribution with

stddev=sqrt(2/(filter_size*filter_size*num_filters))
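As a side note, sqrt(2/fan) as written above is the He variant of variance-scaling initialization; Glorot/Xavier proper uses sqrt(2/(fan_in + fan_out)). Either way, the formula can be expressed directly as a TensorFlow 1.x initializer. A minimal sketch, assuming filter_size and num_filters are already defined:

import numpy as np
import tensorflow as tf

# stddev from the formula above
stddev = np.sqrt(2.0 / (filter_size * filter_size * num_filters))
init = tf.truncated_normal_initializer(stddev=stddev)

# roughly equivalent built-in: variance scaling with scale=2.0
init = tf.variance_scaling_initializer(scale=2.0, mode='fan_in')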

Q1. How should I initialize the variables? Is the code below correct?

weights = tf.Variable(tf.random_normal([img_size, img_size, 1, num_filters], stddev=stddev))

The paper gives no details on how to construct the loss function, and I am unable to find an equivalent TensorFlow function to set the learning rate on a log scale (only exponential_decay). I understand that MomentumOptimizer is equivalent to stochastic gradient descent with momentum.

Q2: Is it possible to set the learning rate on a log scale? (See the sketch after Q3.)

Q3: How do I create the loss function described above?
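Regarding Q2: tf.train has no dedicated log-scale schedule, but the paper's schedule (10^-1 down to 10^-4, evenly spaced on a log scale over the epochs) can simply be precomputed with NumPy and fed in once per epoch. A minimal sketch, with num_epochs assumed:

import numpy as np

# one learning rate per epoch, evenly spaced on a log scale
# (num_epochs is hypothetical, e.g. 40)
lr_schedule = np.logspace(-1, -4, num=num_epochs)
# lr_schedule[0] == 0.1, lr_schedule[-1] == 0.0001

Note that exponential_decay can produce the same geometric sequence if you set staircase=True, decay_steps to the number of steps per epoch, and decay_rate = (1e-4 / 1e-1) ** (1.0 / (num_epochs - 1)).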

I followed this website and wrote the code below. Assume the model() function returns the network mentioned in the paper, and lamda = 0.0001:

inputs = tf.placeholder(tf.float32, shape=[None, patch_size, patch_size, num_channels])
labels = tf.placeholder(tf.float32, [None, patch_size, patch_size, num_channels])

# get the model output and weights for each conv
pred, weights = model()

# define the loss function
# (softmax cross-entropy is a placeholder copied from a classification
# example; see Q3 -- the paper describes a regression loss)
loss = tf.nn.softmax_cross_entropy_with_logits_v2(labels=labels, logits=pred)

# l2 penalty on the conv weights, with lambda = 0.0001
regularizers = 0.0
for weight in weights:
    regularizers += tf.nn.l2_loss(weight)

loss = tf.reduce_mean(loss + 0.0001 * regularizers)

learning_rate = tf.train.exponential_decay(???) # Not sure if we can have custom learning rate for log scale
optimizer = tf.train.MomentumOptimizer(learning_rate, momentum).minimize(loss, global_step)
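For Q3, here is a minimal sketch of what the quoted text seems to describe: an l2 (squared-error) regression loss between the predicted and target wavelet subbands, plus the weight penalty. This is my reading of the paper, not a confirmed implementation; pred and labels are as above:

# squared-error regression loss over the wavelet subbands
# (assumes labels holds WT(HR) - WT(HR'), the residual target)
data_loss = tf.reduce_mean(tf.squared_difference(pred, labels))

# l2 weight penalty with lambda = 0.0001
regularizers = tf.add_n([tf.nn.l2_loss(w) for w in weights])
loss = data_loss + 0.0001 * regularizers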

Note: since I am a deep learning / TensorFlow beginner and I copy-pasted code here and there, please feel free to correct it if you can ;)

dgu*_*umo 2

The other answers are very detailed and helpful. Here is a code example that uses a placeholder to decay the learning rate on a log scale. HTH.

import tensorflow as tf
import numpy as np

# data simulation
N = 10000
D = 10
x = np.random.rand(N, D)
w = np.random.rand(D, 1)
y = np.dot(x, w)

print(y.shape)

# modeling
batch_size = 100
tni = tf.truncated_normal_initializer()
X = tf.placeholder(tf.float32, [batch_size, D])
Y = tf.placeholder(tf.float32, [batch_size, 1])
W = tf.get_variable("w", shape=[D, 1], initializer=tni)
B = tf.zeros([1])

# the learning rate is a placeholder, so its value can change on every run
lr = tf.placeholder(tf.float32)

pred = tf.add(tf.matmul(X, W), B)
print(pred.shape)
mse = tf.reduce_sum(tf.losses.mean_squared_error(Y, pred))
opt = tf.train.MomentumOptimizer(lr, 0.9)

train_op = opt.minimize(mse)

learning_rate = 0.0001

do_train = True
acc_err = 0.0
sess = tf.Session()
sess.run(tf.global_variables_initializer())
while do_train:
  for i in range(100000):
    if i > 0 and i % N == 0:
      # epoch done: halve the learning rate -- a constant factor per
      # epoch, i.e. evenly spaced steps on a log scale
      learning_rate /= 2
      print("Epoch completed. LR =", learning_rate)

    idx = i // batch_size + i % batch_size  # integer division so the slice index is an int
    f = {X: x[idx:idx+batch_size, :], Y: y[idx:idx+batch_size, :], lr: learning_rate}
    _, err = sess.run([train_op, mse], feed_dict=f)
    acc_err += err
    if i % 5000 == 0:
      print("Average error = {}".format(acc_err / 5000))
      acc_err = 0.0
  do_train = False  # stop after one pass; the flag was never cleared otherwise
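To match the paper's 10^-1 to 10^-4 range exactly instead of halving, you could precompute the per-epoch rates and index into them (num_epochs is hypothetical, as before):

# exact-match schedule for the paper's range
rates = np.logspace(-1, -4, num=num_epochs)
# then, at the start of epoch e:
learning_rate = rates[e]  # fed through the lr placeholder as above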