Shape mismatch: The shape of labels (received (128,)) should equal the shape of logits except for the last dimension (received (16, 424))

Vin*_*yan 5 python keras tensorflow recurrent-neural-network

ValueError: in converted code:

<ipython-input-63-1e3afece3370>:10 train_step  *
    loss += loss_func(targ, logits)
<ipython-input-43-44b2a8f6794e>:11 loss_func  *
    loss_ = loss_object(real, pred)
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/losses.py:124 __call__
    losses = self.call(y_true, y_pred)
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/losses.py:216 call
    return self.fn(y_true, y_pred, **self._fn_kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/losses.py:973 sparse_categorical_crossentropy
    y_true, y_pred, from_logits=from_logits, axis=axis)
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/backend.py:4431 sparse_categorical_crossentropy
    labels=target, logits=output)
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/nn_ops.py:3477 sparse_softmax_cross_entropy_with_logits_v2
    labels=labels, logits=logits, name=name)
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/nn_ops.py:3393 sparse_softmax_cross_entropy_with_logits
    logits.get_shape()))

ValueError: Shape mismatch: The shape of labels (received (128,)) should equal the shape of logits except for the last dimension (received (16, 424)).

The code I am working from used legacy_seq2seq.sequence_loss_by_example, which is now deprecated. So I tried the SparseCategoricalCrossentropy loss instead, and it throws the error above.
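To make the rule in the error message concrete, here is a minimal plain-Python sketch of the check it describes: the labels shape has to equal the logits shape with the last (class) axis removed. `labels_match_logits` is a hypothetical helper written for illustration, not a TensorFlow function, and it only restates the message rather than reproducing TensorFlow's internal logic.

```python
# Hypothetical helper (not a TensorFlow API): restates the shape rule from the
# error message for sparse cross-entropy using plain tuples.
def labels_match_logits(labels_shape, logits_shape):
    """Return True if labels_shape equals logits_shape without its last axis."""
    return tuple(labels_shape) == tuple(logits_shape)[:-1]


# Shapes reported in the traceback: 128 labels against 16 rows of logits.
print(labels_match_logits((128,), (16, 424)))       # False
# One label per row of logits satisfies the rule.
print(labels_match_logits((128,), (128, 424)))      # True
# So does one label per time step when logits keep a time axis.
print(labels_match_logits((16, 8), (16, 8, 424)))   # True
```

Read this way, the message says the loss received 128 label values but only 16 rows of logits, so the two tensors disagree on how many predictions there are.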

Model (Keras)

    def build_model(training=True):
        input_ = tf.keras.layers.Input(shape=(unfold_max,), name='inputs')
        embedding = tf.keras.layers.Embedding(input_dim=n_items + 1,
                                              output_dim=128)(input_)
        cells = []
        for _ in range(1):
            cells.append(
                tf.keras.layers.LSTMCell(128,
                                         dropout=0.25,
                                         activation=tf.keras.activations.tanh)
            )
        cell_output = tf.keras.layers.RNN(cells, return_state=False)(embedding)
        softmax_W = tf.Variable(tf.ones(shape=(128, 1 + n_items)), 'softmax_w')
        softmax_b = tf.Variable(tf.zeros(shape=(n_items + 1)), 'softmax_b')
        output = tf.reshape(cell_output, [-1, 128])
        logits = tf.matmul(output, softmax_W) + softmax_b
        return tf.keras.Model(inputs=[input_], outputs=[logits])

Loss function

    optimizer = tf.keras.optimizers.Adam()
    loss_object = tf.keras.losses.SparseCategoricalCrossentropy(
                     from_logits=True,
                     reduction='none')

    def loss_func(real, pred):
        mask = tf.math.logical_not(tf.math.equal(real, 0))
        loss_ = loss_object(real, pred)
        mask = tf.cast(mask, dtype=loss_.dtype)
        loss_ *= mask   
        return tf.reduce_mean(loss_)

Training step

    @tf.function
    def train_step(inp, targ):
        loss = 0
        with tf.GradientTape() as tape:
            logits = model(inp)
            loss += loss_func(targ, logits)

        variables = model.trainable_variables
        gradients = tape.gradient(loss, variables)
        optimizer.apply_gradients(zip(gradients, variables))
        return loss

Input: [135, 144, 0, 0, 0, 0, 0, 0]

Target: [144, 127, 0, 0, 0, 0, 0, 0]
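As a rough way to see where the two shapes in the error may come from, the sketch below follows the shapes through the model using plain tuples, with no TensorFlow involved. The values are assumptions: a batch size of 16 and a sequence length of 8 are inferred from the 8-element samples and from 128 = 16 × 8, and 424 is taken to be `n_items + 1`. `trace_shapes` is a hypothetical helper. It relies on `tf.keras.layers.RNN` defaulting to `return_sequences=False`, which keeps only the last time step, and on the targets being flattened to one id per time step before the shape check.

```python
from math import prod


def trace_shapes(batch, steps, units, vocab, return_sequences):
    """Follow tensor shapes through the model using plain tuples (illustrative only)."""
    # RNN output: every time step if return_sequences=True, otherwise only the last one.
    rnn_out = (batch, steps, units) if return_sequences else (batch, units)
    # tf.reshape(cell_output, [-1, units]) collapses all leading axes into rows.
    rows = prod(rnn_out[:-1])
    # The matmul with softmax_W maps each row from `units` to `vocab` scores.
    logits = (rows, vocab)
    # Targets hold one item id per time step; flattened, that is batch * steps values.
    labels = (batch * steps,)
    return logits, labels


# Assumed values: batch=16, steps=8, units=128, vocab=424 (see the note above).
print(trace_shapes(16, 8, 128, 424, return_sequences=False))  # ((16, 424), (128,))
print(trace_shapes(16, 8, 128, 424, return_sequences=True))   # ((128, 424), (128,))
```

Under these assumptions the first call reproduces the shapes in the error, and the second suggests that passing `return_sequences=True` to the RNN layer would yield one row of logits per time step. The targets, and the mask built from them in `loss_func`, would presumably need to be flattened the same way for the element-wise multiplication to line up.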