Related questions and solutions (0)

Adding multiple layers in TensorFlow causes the loss function to become NaN

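I am writing a neural network classifier in TensorFlow/Python for the notMNIST dataset. I have implemented L2 regularization and dropout on the hidden layers. It works fine as long as there is only one hidden layer, but when I add more layers (to improve accuracy), the loss function increases rapidly at each step and becomes NaN by step 5. I tried temporarily disabling dropout and L2 regularization, but I get the same behavior whenever there are two or more layers. I even rewrote my code from scratch (with some refactoring to make it more flexible), but the results are the same. The number and size of the layers are controlled by hidden_layer_spec. What am I missing?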

# Works for np.array([1024]) with about 96.1% accuracy
hidden_layer_spec = np.array([1024, 300])
num_hidden_layers = hidden_layer_spec.shape[0]
batch_size = 256
beta = 0.0005

epochs = 100
stepsPerEpoch = float(train_dataset.shape[0]) / batch_size
num_steps = int(math.ceil(float(epochs) * stepsPerEpoch))

l2Graph = tf.Graph()
with l2Graph.as_default():
  #with tf.device('/cpu:0'):
      # Input data. For the training data, we use a placeholder that will be fed
      # at run time with a training minibatch.
      tf_train_dataset = tf.placeholder(tf.float32,
                                        shape=(batch_size, image_size * image_size))
      tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels)) …
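
The excerpt does not show how the weights are initialized, so the following is only a guess at one common cause of this symptom, not a diagnosis of the asker's code. If every layer draws its weights with a fixed, large standard deviation, the activations of a ReLU network grow with each added layer, and a softmax cross-entropy on top can overflow to NaN within a few steps. A minimal pure-Python sketch (no TensorFlow; the layer widths are shrunk from the question's sizes to keep it fast, and `forward_scale` is a hypothetical helper) compares a standard deviation of 1.0 with the `sqrt(2 / fan_in)` scaling often used for ReLU layers:

```python
import math
import random


def forward_scale(layer_sizes, stddev_for, seed=0):
    """Push one random input through ReLU layers; return the RMS activation per layer.

    layer_sizes: e.g. [64, 128, 64, 32] means 64 inputs, then layers of 128, 64, 32 units.
    stddev_for:  function mapping fan_in to the standard deviation of that layer's weights.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(layer_sizes[0])]
    scales = []
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        std = stddev_for(fan_in)
        out = []
        for _ in range(fan_out):
            # Pre-activation of one unit with freshly drawn Gaussian weights.
            z = sum(rng.gauss(0.0, std) * xi for xi in x)
            out.append(max(0.0, z))  # ReLU
        x = out
        scales.append(math.sqrt(sum(v * v for v in x) / len(x)))
    return scales


sizes = [64, 128, 64, 32]  # illustrative widths, not the asker's
naive = forward_scale(sizes, lambda fan_in: 1.0)
scaled = forward_scale(sizes, lambda fan_in: math.sqrt(2.0 / fan_in))
print("stddev = 1.0          :", naive)
print("stddev = sqrt(2/fan_in):", scaled)
```

With the fixed standard deviation, the activation scale is multiplied by roughly `sqrt(fan_in / 2)` at every layer, which is why a network that trains with one hidden layer can diverge once a second one is added. Whether this applies here depends on the initialization code that the excerpt truncates.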


python neural-network deep-learning tensorflow

Nim*_*and


Score: 13 | Answers: 2 | Views: 5225

What is the default variable initializer in TensorFlow?

What is the default method of variable initialization used when tf.get_variable() is called without specifying any initializer? The docs just say "None".
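
As far as I know, in TensorFlow 1.x a call to tf.get_variable() with initializer=None falls back to the initializer of the enclosing variable scope and, if that is also unset, to glorot_uniform_initializer for floating-point dtypes. Older releases behaved differently, so this is worth checking against the version in use. The sketch below is not TensorFlow code; it only illustrates, in pure Python, what a Glorot uniform draw computes. The helper name `glorot_uniform` and the shape 784 x 300 are assumptions chosen for illustration.

```python
import math
import random


def glorot_uniform(fan_in, fan_out, seed=None):
    """Return a fan_in x fan_out weight matrix drawn from U(-limit, limit),
    where limit = sqrt(6 / (fan_in + fan_out))."""
    rng = random.Random(seed)
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


fan_in, fan_out = 784, 300  # illustrative shape
weights = glorot_uniform(fan_in, fan_out, seed=0)
limit = math.sqrt(6.0 / (fan_in + fan_out))

# A uniform distribution on (-limit, limit) has variance limit**2 / 3,
# which equals 2 / (fan_in + fan_out).
flat = [w for row in weights for w in row]
variance = sum(w * w for w in flat) / len(flat)
print("limit:", limit)
print("sample variance:", variance, "target:", 2.0 / (fan_in + fan_out))
```

The point of the scaling is that the weight variance shrinks as the layer gets wider, which keeps activation magnitudes roughly stable from layer to layer.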

python machine-learning deep-learning tensorflow

luo*_*h97


Score: 13 | Answers: 1 | Views: ~10,000

Loss decreases but weights do not seem to change during TensorFlow gradient descent

I have set up a very simple multilayer perceptron with a single hidden layer that uses a sigmoid transfer function, and mock data with two inputs.

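I tried to set it up following the "Simple Feedforward Neural Network using TensorFlow" example on GitHub. I won't post the whole thing here, but my cost function is set up as follows: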

# Backward propagation
loss = tensorflow.losses.mean_squared_error(labels=y, predictions=yhat)
cost = tensorflow.reduce_mean(loss, name='cost')
updates = tensorflow.train.GradientDescentOptimizer(0.01).minimize(cost)


I then simply loop over a number of epochs, with the intention that my weights are optimized by the updates operation at each step:

with tensorflow.Session() as sess:
    init = tensorflow.global_variables_initializer()
    sess.run(init)

    for epoch in range(10):

        # Train with each example
        for i in range(len(train_X)):
            feed_dict = {X: train_X[i: i + 1], y: train_y[i: i + 1]}

            res = sess.run([updates, loss], feed_dict)

            print("epoch {}, step {}. w_1: {}, loss: {}".format(epoch, i, w_1.eval(), res[1]))

        train_result = sess.run(predict, feed_dict={X: train_X, y: train_y}) …
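
One way to check whether the weights really move is to compare them numerically before and after each update instead of reading printed arrays, because a step of 0.01 times a small sigmoid gradient can be hard to spot in the output. The sketch below is a pure-Python stand-in, not the asker's network: it trains a single sigmoid unit with squared error, one example at a time, at the same learning rate of 0.01. The data, the helper `sgd_step`, and all variable names are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sgd_step(w, b, x, y, lr):
    """One gradient-descent step on the squared error of a single sigmoid unit.

    Returns the updated weights, the updated bias, and the loss before the update.
    """
    yhat = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    loss = (yhat - y) ** 2
    # dL/dz = 2 * (yhat - y) * sigmoid'(z), with sigmoid'(z) = yhat * (1 - yhat)
    dz = 2.0 * (yhat - y) * yhat * (1.0 - yhat)
    new_w = [wi - lr * dz * xi for wi, xi in zip(w, x)]
    new_b = b - lr * dz
    return new_w, new_b, loss


rng = random.Random(0)
# Mock data with two inputs; the label is 1 when the inputs sum to a positive value.
train_X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(50)]
train_y = [1.0 if x[0] + x[1] > 0 else 0.0 for x in train_X]

w, b = [0.0, 0.0], 0.0
initial_w = list(w)
for epoch in range(10):
    epoch_loss = 0.0
    max_delta = 0.0
    for x, y in zip(train_X, train_y):
        new_w, b, loss = sgd_step(w, b, x, y, lr=0.01)
        # Track the largest change any single weight made in one step.
        max_delta = max(max_delta, max(abs(n - o) for n, o in zip(new_w, w)))
        w = new_w
        epoch_loss += loss
    print("epoch {}: mean loss {:.6f}, largest single-step weight change {:.2e}"
          .format(epoch, epoch_loss / len(train_X), max_delta))

total_change = max(abs(n - o) for n, o in zip(w, initial_w))
print("total weight change after training:", total_change)
```

In this sketch no single step can change a weight by more than 0.005, since the gradient factor is at most 0.5 in magnitude and the inputs lie in [-1, 1], yet the accumulated change over all steps is much larger. In the TensorFlow version, the analogous check would be to fetch w_1 with sess.run before and after a batch of updates and look at the difference.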


python tensorflow

qua*_*ant

2018-04-07

Score: 6 | Answers: 1 | Views: 326