ValueError: Data cardinality is ambiguous

Ars*_*ray 7 python lstm keras tensorflow

I am trying to train an LSTM network on data taken from a DataFrame.

Here is the code:

x_lstm=x.to_numpy().reshape(1,x.shape[0],x.shape[1])

model = keras.models.Sequential([
    keras.layers.LSTM(x.shape[1], return_sequences=True, input_shape=(x_lstm.shape[1],x_lstm.shape[2])),
    keras.layers.LSTM(NORMAL_LAYER_SIZE, return_sequences=True),
    keras.layers.LSTM(NORMAL_LAYER_SIZE),
    keras.layers.Dense(y.shape[1])
])

optimizer=keras.optimizers.Adadelta()

model.compile(loss="mse", optimizer=optimizer)
for i in range(150):
    history = model.fit(x_lstm, y)
    save_model(model,'tmp.rnn')

This fails with:

ValueError: Data cardinality is ambiguous:
  x sizes: 1
  y sizes: 99
Please provide data which shares the same first dimension.

When I change the model to

model = keras.models.Sequential([
    keras.layers.LSTM(x.shape[1], return_sequences=True, input_shape=x_lstm.shape),
    keras.layers.LSTM(NORMAL_LAYER_SIZE, return_sequences=True),
    keras.layers.LSTM(NORMAL_LAYER_SIZE),
    keras.layers.Dense(y.shape[1])
])

it fails with the following error:

Input 0 of layer lstm_9 is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 1, 99, 1200]

How do I get this to work?

x has shape (99, 1200) (99 items with 1200 features each; this is just a sample of a larger dataset), and y has shape (99, 1).

Ten*_*ior 9

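As the error suggests, the first dimensions of X and y are different. The first dimension is the batch size, and it must be the same for both.

Please make sure that y also has the shape (1, something).

I could reproduce your error with the code shown below: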
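To make the mismatch concrete for the shapes in the question, here is a minimal NumPy-only sketch. The helper `first_dims` and the zero-filled arrays are placeholders of mine, not part of the original code, and the two options rest on different assumptions about what the 99 rows mean, so which one applies depends on the data.

```python
import numpy as np

def first_dims(*arrays):
    """Return the size of axis 0 for each array (the sizes Keras compares)."""
    return [a.shape[0] for a in arrays]

# Stand-ins with the shapes from the question (values are placeholders).
x = np.zeros((99, 1200), dtype=np.float32)
y = np.zeros((99, 1), dtype=np.float32)

# The question's reshape: one sample made of 99 timesteps.
x_lstm = x.reshape(1, x.shape[0], x.shape[1])
print(first_dims(x_lstm, y))    # [1, 99] -> mismatch, hence the ValueError

# Option A (assumption: each row is an independent sample with one timestep).
x_a = x.reshape(x.shape[0], 1, x.shape[1])
print(first_dims(x_a, y))       # [99, 99]

# Option B (assumption: the 99 rows form one single sequence).
y_b = y.reshape(1, -1)
print(first_dims(x_lstm, y_b))  # [1, 1]
```

With option B the first dimensions agree, but the output layer's size may also need revisiting so that the predictions line up with the 99 targets.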

from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM
import tensorflow as tf
import numpy as np


# define sequences
sequences = [
    [1, 2, 3, 4],
       [1, 2, 3],
             [1]
    ]

# pad sequence
padded = pad_sequences(sequences)
X = np.expand_dims(padded, axis = 0)
print('Shape of X is ', X.shape) # (1, 3, 4)

y = np.array([1,0,1])
#y = y.reshape(1,-1)
print('Shape of y is', y.shape) # (3,)

model = Sequential()
model.add(LSTM(4, return_sequences=False, input_shape=(None, X.shape[2])))
model.add(Dense(1, activation='sigmoid'))

model.compile (
    loss='mean_squared_error',
    optimizer=tf.keras.optimizers.Adam(0.001))

model.fit(x = X, y = y)

Looking at the printed shapes:

Shape of X is  (1, 3, 4)
Shape of y is (3,)

The error can be fixed by uncommenting the line y = y.reshape(1,-1), which makes the first dimension (the batch size) equal to 1 for both X and y.

The working code and its output are shown below:

from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM
import tensorflow as tf
import numpy as np


# define sequences
sequences = [
    [1, 2, 3, 4],
       [1, 2, 3],
             [1]
    ]

# pad sequence
padded = pad_sequences(sequences)
X = np.expand_dims(padded, axis = 0)
print('Shape of X is ', X.shape) # (1, 3, 4)

y = np.array([1,0,1])
y = y.reshape(1,-1)
print('Shape of y is', y.shape) # (1, 3)

model = Sequential()
model.add(LSTM(4, return_sequences=False, input_shape=(None, X.shape[2])))
model.add(Dense(1, activation='sigmoid'))

model.compile (
    loss='mean_squared_error',
    optimizer=tf.keras.optimizers.Adam(0.001))

model.fit(x = X, y = y)

The output of the above code is:

Shape of X is  (1, 3, 4)
Shape of y is (1, 3)
1/1 [==============================] - 0s 1ms/step - loss: 0.2588
<tensorflow.python.keras.callbacks.History at 0x7f5b0d78f4a8>
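On the second error in the question (expected ndim=3, found ndim=4): Keras adds the batch axis itself, shown as None in the message, so input_shape should list only (timesteps, features). Passing x_lstm.shape includes the leading 1 and yields a 4-D input. The sketch below only manipulates shape tuples with NumPy to show the difference; the zero-filled x_lstm is a placeholder with the shape from the question.

```python
import numpy as np

# Placeholder with the question's layout: (batch, timesteps, features).
x_lstm = np.zeros((1, 99, 1200), dtype=np.float32)

# Keras prepends the batch axis (None) to whatever input_shape is given.
with_batch = (None,) + x_lstm.shape         # implied by input_shape=x_lstm.shape
without_batch = (None,) + x_lstm.shape[1:]  # implied by input_shape=x_lstm.shape[1:]

print(with_batch)     # (None, 1, 99, 1200) -> 4 dimensions, rejected by LSTM
print(without_batch)  # (None, 99, 1200)    -> 3 dimensions, as LSTM expects
```

This matches the first version of the model in the question, which passes (x_lstm.shape[1], x_lstm.shape[2]) and therefore only hits the cardinality error.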

Hope this helps. Happy learning!