I wrote an algorithm using the TensorFlow framework and ran into a problem: tf.train.Optimizer.compute_gradients(loss) returns zeros for all weights. A second problem is that if I set the batch size larger than about 5, tf.histogram_summary on the weights throws an error because some values are NaN.
I can't provide a reproducible example here because my code is quite bulky, and I'm not good enough at TF to make it shorter. I'll try to paste some snippets.
The main loop:
images_ph = tf.placeholder(tf.float32, shape=some_shape)
labels_ph = tf.placeholder(tf.float32, shape=some_shape)

output = inference(BATCH_SIZE, images_ph)
loss = loss(labels_ph, output)
train_op = train(loss, global_step)

session = tf.Session()
session.run(tf.initialize_all_variables())

for i in xrange(MAX_STEPS):
    images, labels = train_dataset.get_batch(BATCH_SIZE, yolo.INPUT_SIZE, yolo.OUTPUT_SIZE)
    session.run([loss, train_op], feed_dict={images_ph: images, labels_ph: labels})
The train op (this is where the problem occurs):
def train(total_loss, global_step):
    opt = tf.train.AdamOptimizer()
    grads = opt.compute_gradients(total_loss)
    # Here the gradients are all zeros
    for grad, var in grads:
        if grad is not None:
            tf.histogram_summary("gradients/" + var.op.name, grad)
    return opt.apply_gradients(grads, global_step=global_step)
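For context, the combination of all-zero gradients and NaNs that only appear at larger batch sizes often comes from a saturating activation or an unguarded log in the loss. This is a hedged, self-contained NumPy sketch of both effects (the values `z`, `y`, `p` are illustrative and not taken from the model above):

```python
import numpy as np

# Hedged sketch (not from the original post): two common causes of
# all-zero gradients and NaN losses, reproduced with plain NumPy.

# 1) Saturated sigmoid: d(sigmoid)/dz = s * (1 - s) underflows to 0
z = np.array([0.0, 10.0, 40.0])      # illustrative pre-activations
s = 1.0 / (1.0 + np.exp(-z))         # sigmoid
grad = s * (1.0 - s)                 # derivative; grad[2] is exactly 0.0

# 2) Cross-entropy without an epsilon: 0 * log(0) produces NaN
y = np.array([1.0, 0.0])             # illustrative targets
p = np.array([0.9, 0.0])             # illustrative predictions
with np.errstate(divide="ignore", invalid="ignore"):
    loss = -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
# loss[1] is NaN; clipping p away from 0 and 1 avoids this
eps = 1e-8
p_safe = np.clip(p, eps, 1.0 - eps)
loss_safe = -(y * np.log(p_safe) + (1.0 - y) * np.log(1.0 - p_safe))
```

If either effect is present in the TF graph above, the usual fixes are adding a small epsilon inside any log or sqrt in the loss and checking whether the activations feeding compute_gradients are saturating.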