How to mask multiple outputs in TensorFlow 2 LSTM training?

Question

j s*_*sad 2 python numpy lstm keras tensorflow

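I am training an LSTM model in TensorFlow 2 to predict two outputs: streamflow and water temperature.

For some timesteps there is both a flow label and a temperature label,
for some there is only a flow label or only a temperature label,
and for some there is neither.

So the loss function needs to ignore the temperature and flow losses wherever they have no label. I have read a fair amount of the TF docs, but I am struggling to figure out how best to do this.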

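So far I have tried

specifying sample_weight_mode='temporal' when compiling the model, and then passing a numpy array as sample_weight when calling fit.

When I do this, I get an error asking me to pass a 2D array. That confuses me, because there are 3 dimensions: n_samples, sequence_length, and n_outputs.

Here is some code showing basically what I am trying to do: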

import tensorflow as tf
import numpy as np

# set up the model
simple_lstm_model = tf.keras.models.Sequential([
    tf.keras.layers.LSTM(8, return_sequences=True),
    tf.keras.layers.Dense(2)
])

simple_lstm_model.compile(optimizer='adam', loss='mae',
                          sample_weight_mode='temporal')

n_sample = 2
seq_len = 10
n_feat = 5
n_out = 2

# random in/out
x = np.random.randn(n_sample, seq_len, n_feat)
y_true = np.random.randn(n_sample, seq_len, n_out)

# set the initial mask as all ones (everything counts equally)
mask = np.ones([n_sample, seq_len, n_out])
# set the mask so that in the 0th sample, in the 3-8th time step
# the 1th variable is not counted in the loss function
mask[0, 3:8, 1] = 0

simple_lstm_model.fit(x, y_true, sample_weight=mask)

Error:

ValueError: Found a sample_weight array with shape (2, 10, 2). In order to use timestep-wise sample weighting, you should
pass a 2D sample_weight array.

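Any ideas? I must be misunderstanding what sample_weight does, because to me it only makes sense if the sample_weight array has the same dimensions as the output. I could write a custom loss function and handle the masking manually, but it seems like there should be a more general or built-in solution.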

Answer 1

Szy*_*zke 5

1. `sample_weights`

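Yes, you are misunderstanding it. In this case you have 2 samples and 10 timesteps, with 5 features each. You can pass a 2D array of shape (n_samples, sequence_length), so that each timestep of each sample contributes differently to the total loss, while all outputs at that timestep are weighted the same (which is usually what is wanted).

That is not what you are after at all. You want to mask certain loss values after they are computed, so that they do not contribute.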
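To make the limitation concrete, here is a small NumPy-only sketch. It assumes the mask from the question and collapses it into the 2D shape that temporal sample weighting expects. Taking the minimum over the output axis is just one illustrative way to collapse it, not something Keras does for you; the name `temporal_weight` is made up for this example.

```python
import numpy as np

n_sample, seq_len, n_out = 2, 10, 2

# Per-output mask from the question: 1 = label present, 0 = label missing.
mask = np.ones((n_sample, seq_len, n_out))
mask[0, 3:8, 1] = 0

# A temporal sample_weight holds one weight per (sample, timestep), so the
# output axis has to be collapsed. Taking the minimum keeps a timestep only
# if every output is labelled there (an assumption made for illustration).
temporal_weight = mask.min(axis=-1)

print(temporal_weight.shape)  # one weight per (sample, timestep)
print(temporal_weight[0])     # timesteps 3..7 are zeroed for BOTH outputs
```

Whichever way the output axis is collapsed, the 2D array can no longer say "ignore temperature but keep flow" at a given timestep, which is why a per-output mask has to be applied inside the loss instead.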

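2. Custom loss

One possible solution is to implement your own loss function that multiplies the loss tensor by the mask before taking the mean or sum.

Basically, you pass the mask and the target tensor concatenated together somehow, and split them apart again inside the function. This is sufficient: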

import numpy as np
import tensorflow as tf


def my_loss_function(y_true_mask, y_pred):
    # Recover y and mask, which were concatenated along the last axis
    y_true_mask = tf.cast(y_true_mask, y_pred.dtype)
    y_true, mask = tf.split(y_true_mask, 2, axis=-1)
    # You could use reduce_sum or other combinations
    return tf.math.reduce_mean(tf.math.abs(y_true - y_pred) * mask)

Now your code (no weighting, as it is not needed):

simple_lstm_model = tf.keras.models.Sequential(
    [tf.keras.layers.LSTM(8, return_sequences=True), tf.keras.layers.Dense(2)]
)

simple_lstm_model.compile(optimizer="adam", loss=my_loss_function)

n_sample = 2
seq_len = 10
n_feat = 5
n_out = 2

x = np.random.randn(n_sample, seq_len, n_feat)
y_true = np.random.randn(n_sample, seq_len, n_out)

mask = np.ones([n_sample, seq_len, n_out])
mask[0, 3:8, 1] = 0

# Concatenate y and mask along the last axis so the sample axis stays first
# and Keras can still batch it: shape (n_sample, seq_len, 2 * n_out)
y_true_mask = np.concatenate([y_true, mask], axis=-1)

simple_lstm_model.fit(x, y_true_mask)

So it works. You could also combine these values in some other way, but I hope you get a feel for how it can be done.

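3. Masking outputs

Note that the approach above introduces some problems. If the mask contains a lot of zeros and you take the mean, you may get a very small loss value, which inhibits learning. On the other hand, if you go with the sum, it may explode.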
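One common way to soften this trade-off (not part of the original answer, so treat it as an assumption) is to divide the masked sum by the number of labelled entries rather than by the total number of entries. The NumPy sketch below only illustrates the arithmetic; `masked_mae` is a hypothetical helper name, and inside a Keras loss the same idea could be written with `tf.reduce_sum`.

```python
import numpy as np


def masked_mae(y_true, y_pred, mask):
    """Mean absolute error over labelled entries only (mask == 1)."""
    abs_err = np.abs(y_true - y_pred) * mask
    n_valid = mask.sum()
    # Guard against a batch that has no labels at all.
    return abs_err.sum() / n_valid if n_valid > 0 else 0.0


y_true = np.array([[1.0, 2.0], [3.0, 4.0]])
y_pred = np.zeros_like(y_true)
mask = np.array([[1.0, 0.0], [0.0, 0.0]])  # only one labelled entry

# Plain mean divides by all 4 entries, so the single error of 1.0 is diluted.
plain_mean = (np.abs(y_true - y_pred) * mask).mean()
# Normalizing by the number of labelled entries keeps the scale stable.
normalized = masked_mae(y_true, y_pred, mask)

print(plain_mean, normalized)
```

With this normalization the loss magnitude no longer depends on how many labels happen to be missing, though batches with very few labels will produce noisier gradients.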
