khe*_*edi · tags: tensorflow, dropout, tensorflow2.0
I am using a dropout layer (tf.keras.layers.Dropout) in a model implemented in TensorFlow. I set training=True during training and training=False during testing, and the performance was poor. When I accidentally left training=True during testing as well, the results got better. I would like to know what is happening. Why does it affect the training loss? I make no changes to training, and the entire test runs only after training has finished. Yet setting training=True at test time seems to affect the training process: the training loss approaches zero, and the test results then improve. Any possible explanation?

Thanks,
Sorry for the late reply, but Celius's answer is not entirely correct.

The training argument of the Dropout layer (and of the BatchNormalization layer) defines whether the layer should run in training mode or inference mode. You can read about this in the official documentation.
However, the documentation is somewhat unclear about how this affects the execution of your network. Setting training=False does not mean that the Dropout layer is removed from your network. It is by no means ignored, as Celius suggested; it simply behaves in inference mode. For Dropout, this means that no units are dropped. For BatchNormalization, it means that the layer uses the mean and variance estimated during training instead of computing new statistics for every mini-batch. This distinction is important. Conversely, if you set training=True, the layer behaves in training mode and dropout is applied.
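To make the two modes concrete, here is a minimal NumPy sketch of the inverted-dropout scheme that tf.keras.layers.Dropout implements. This is an illustration of the semantics, not TensorFlow's actual code; the function name and signature are my own for this example.

```python
import numpy as np

def dropout(x, rate, training, rng=None):
    """Inverted dropout, illustrating the two modes of tf.keras.layers.Dropout.

    training=True:  zero each unit with probability `rate`, then scale the
                    survivors by 1/(1-rate) so the expected activation is
                    unchanged.
    training=False: identity -- the layer is still part of the network, it
                    just passes activations through untouched.
    """
    if not training:
        return x
    if rng is None:
        rng = np.random.default_rng()
    keep = (rng.random(x.shape) >= rate).astype(x.dtype)
    return x * keep / (1.0 - rate)

x = np.ones((4, 8), dtype=np.float32)
print(dropout(x, rate=0.5, training=False))  # identical to x: nothing dropped
```

With training=True and rate=0.5, each entry of the output is either 0.0 (dropped) or 2.0 (kept and rescaled), which is exactly why applying it at test time injects noise into your predictions.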
Now to your question: the behavior of your network does not make sense. If dropout is applied to unseen data, there is nothing to be learned from it; you only throw away information, so your results should get worse. But I suspect your problem is not related to the Dropout layer at all. Does your network also use BatchNormalization layers? If BN is applied incorrectly, it can ruin your final results. Without seeing any code, though, it is hard to fully answer your question as asked.
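The BN point can be sketched the same way. Below is a simplified NumPy version of batch normalization (no learnable scale/offset, which real BN layers have); it shows why flipping training at test time changes the output: in training mode the layer normalizes with the statistics of the current test batch rather than the moving averages accumulated during training.

```python
import numpy as np

def batchnorm(x, moving_mean, moving_var, training, eps=1e-3):
    """Simplified BatchNormalization illustrating the training flag.

    training=True:  normalize with the current mini-batch's own statistics.
    training=False: normalize with the moving averages collected in training.
    """
    if training:
        mean, var = x.mean(axis=0), x.var(axis=0)
    else:
        mean, var = moving_mean, moving_var
    return (x - mean) / np.sqrt(var + eps)

# In training mode the output is re-centered around the *test batch* itself,
# so every feature column ends up with mean ~0 regardless of the inputs.
x = np.array([[1.0, 10.0], [3.0, 30.0]])
y = batchnorm(x, moving_mean=np.zeros(2), moving_var=np.ones(2), training=True)
```

If the moving statistics were never updated properly during training (a common BN bug), inference mode normalizes with garbage values while training mode still "works" on each batch, which would match the symptom described in the question.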