Seq-to-seq LSTM performs poorly on simple low-frequency sine waves

Question

Hes*_*sam 5 python lstm tensorflow recurrent-neural-network

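I am trying to train a seq-to-seq model on a simple sine wave. The goal is to take Nin data points and predict the next Nout data points. The task seems simple, and the model predicts well for large frequencies freq (y = sin(freq * x)). For example, for freq=4 the loss is very low and the prediction is very close to the target. For low frequencies, however, the prediction is poor. Any ideas on why the model fails?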

import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, LSTM, RepeatVector, TimeDistributed, Dense

freq = 0.25
Nin, Nout = 14, 14

# Helper function to convert 1d data to (input, target) samples
def windowed_dataset(y, input_window = 5, output_window = 1, stride = 1, num_features = 1):
    L = y.shape[0]
    num_samples = (L - input_window - output_window) // stride + 1
    X = np.zeros([input_window, num_samples, num_features])
    Y = np.zeros([output_window, num_samples, num_features])    
    for ff in np.arange(num_features):
        for ii in np.arange(num_samples):
            start_x = stride * ii
            end_x = start_x + input_window
            X[:, ii, ff] = y[start_x:end_x, ff]
            start_y = stride * ii + input_window
            end_y = start_y + output_window 
            Y[:, ii, ff] = y[start_y:end_y, ff]
    return X, Y

# The input shape is your sequence length and your token embedding size
inputs = Input(shape=(Nin, 1))
# Build a RNN encoder
encoder = LSTM(128, return_sequences=False)(inputs)
# Repeat the encoding for every input to the decoder
encoding_repeat = RepeatVector(Nout)(encoder)
# Pass your (Nout, 128) repeated encoding to the decoder
decoder = LSTM(128, return_sequences=True)(encoding_repeat)
# Output each timestep into a fully connected layer
sequence_prediction = TimeDistributed(Dense(1, activation='linear'))(decoder)
model = Model(inputs, sequence_prediction)
model.compile('adam', 'mse')  # Mean squared error, since this is a regression task
y = np.sin(freq * np.linspace(0, 10, 1000))[:, None]
Ntr = int(0.8 * y.shape[0])
y_train, y_test = y[:Ntr], y[Ntr:]
# windowed_dataset is defined above, so no import from generate_dataset is needed
stride = 1
N_features = 1
Xtrain, Ytrain = windowed_dataset(y_train, input_window=Nin, output_window=Nout, stride=stride,
                                  num_features=N_features)
model.summary()  # summary() prints the table itself and returns None
Xtrain, Ytrain = Xtrain.transpose(1, 0, 2), Ytrain.transpose(1, 0, 2)
print("Xtrain", Xtrain.shape)
model.fit(Xtrain, Ytrain, epochs=30)
plt.figure(); plt.plot(y, 'ro')
for Ns in np.array([10, 50, 200, 400, 800, 1500, 3000]) // 10:
    ypred = model.predict(Xtrain[[Ns]])
    print("ypred", ypred.shape)
    ypred = ypred[-1]
    plt.figure()
    plt.plot(ypred, 'ro')
    plt.plot(Xtrain[Ns], 'm--')
    plt.plot(Ytrain[Ns], 'k.')
    plt.show()
exit()
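
For reference, the windowing step can also be sketched without explicit loops. This is a minimal alternative, not the asker's code: `windowed_dataset_vectorized` is a hypothetical name, and it assumes NumPy 1.20 or newer for `sliding_window_view`. It returns arrays already laid out as (samples, window, features), so the later `transpose(1, 0, 2)` would not be needed.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def windowed_dataset_vectorized(y, input_window=5, output_window=1, stride=1):
    """Split a (length, num_features) array into (samples, window, features) input/target pairs."""
    total = input_window + output_window
    # Windows along axis 0 come back with shape (num_windows, num_features, total)
    windows = sliding_window_view(y, total, axis=0)[::stride]
    # Put the window axis before the feature axis: (num_windows, total, num_features)
    windows = windows.transpose(0, 2, 1)
    # Copy so the results are ordinary writable arrays rather than read-only views
    X = windows[:, :input_window, :].copy()
    Y = windows[:, input_window:, :].copy()
    return X, Y

# Same data and split as in the question
y = np.sin(0.25 * np.linspace(0, 10, 1000))[:, None]
X, Y = windowed_dataset_vectorized(y[:800], input_window=14, output_window=14)
print(X.shape, Y.shape)
```

With the question's settings this yields 773 samples, the same count as the loop-based helper.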


Answer 1

Pet*_*ter 1

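I think it is because the lower the frequency gets, the less pattern there is to pick up. Think of it as having a pattern across the X inputs from which to predict the next output. When all the x(n) inputs only rise slightly in value, there is hardly any pattern. A slight rise already happened before, so nothing new is learned and there is no new pattern. It takes longer training, until enough of the wave has passed, before it can be seen as a pattern.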
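
One way to make this concrete is to count how many sine periods the training data actually covers. The sketch below assumes the question's setup (x running from 0 to 10, with the first 80% used for training); `periods_covered` is a hypothetical helper, not part of the original code.

```python
import numpy as np

def periods_covered(freq, x_max=10.0, train_fraction=0.8):
    """Number of periods of sin(freq * x) inside the training range [0, train_fraction * x_max]."""
    # sin(freq * x) has period 2*pi / freq along x
    return freq * x_max * train_fraction / (2 * np.pi)

for freq in (4.0, 0.25):
    print(f"freq={freq}: {periods_covered(freq):.2f} periods in the training split")
```

For freq=4 the training split spans about 5.09 periods, while for freq=0.25 it spans about 0.32 of a single period, so the network never sees one complete cycle.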

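It would be interesting, though, to keep the same amount of training but jump forward along the sine curve. Or, easier, take your well-working model and test it with inputs at a different spacing. E.g., if you trained it on 5, 10, 15, 20, 25, etc. degrees, give the trained network 0.05, 0.10, etc. degrees (i.e., only change the input, but keep the network).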
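
A small sketch of why input spacing and frequency are interchangeable here. It only builds the input windows and does not run any model; `sine_window` and the step of 0.01 (roughly the question's grid spacing) are assumptions made for illustration.

```python
import numpy as np

def sine_window(freq, step, n_points, start=0.0):
    """Sample sin(freq * x) at n_points positions spaced `step` apart."""
    x = start + step * np.arange(n_points)
    return np.sin(freq * x)

step = 0.01  # assumed spacing, close to linspace(0, 10, 1000)
coarse = sine_window(freq=4.0, step=step, n_points=14)     # spacing the well-working model saw
fine = sine_window(freq=4.0, step=step / 16, n_points=14)  # same wave, inputs 16x closer together
low = sine_window(freq=0.25, step=step, n_points=14)       # the question's low-frequency case

# Sampling the freq=4 wave 16x more finely gives exactly the freq=0.25 window
print(np.allclose(fine, low))
# The value range within one window shrinks accordingly
print(coarse.max() - coarse.min(), fine.max() - fine.min())
```

The first line prints `True`: from the network's point of view, a lower frequency and a finer input spacing produce the same nearly flat window.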

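To continue: sequence-trained networks work well on data that has patterns (language text prediction and so on), but poorly on data with little pattern.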

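--- Edit (too long for a comment reply) ---
Yes, debugging neural networks is hard, but I think you have to go back to first principles: a rising signal is a pattern that can only be detected if it rises and falls (often enough) in training. RNNs and LSTMs do well on serial patterns and ASCII strings; in your test case, slowly drifting numbers are hardly a pattern that can be referenced in memory. Maybe you can improve things by changing the order of the training samples, taking a random position on the sine wave each time. The internal error correction, as it narrows in, might otherwise trust one direction too much: the last 70 samples went up, so why would the 71st go down? That way it may handle this better.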
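
The random-position idea could be sketched as follows. This is an illustration under stated assumptions, not a confirmed fix: `random_phase_windows` is a hypothetical helper, `dx=0.01` approximates the question's grid spacing, and start positions are drawn over one full period, which changes the training data compared with the question. Note also that Keras `model.fit` already shuffles the sample order each epoch by default (`shuffle=True`), so the part that differs here is the coverage of the wave, not the ordering.

```python
import numpy as np

def random_phase_windows(freq, n_in, n_out, num_samples, dx=0.01, seed=0):
    """Draw (input, target) windows of sin(freq * x) starting at random positions within one period."""
    rng = np.random.default_rng(seed)
    period = 2 * np.pi / freq
    # One random start position per sample, uniform over a full period
    starts = rng.uniform(0.0, period, size=num_samples)
    # Each window holds n_in + n_out consecutive points spaced dx apart
    offsets = dx * np.arange(n_in + n_out)
    x = starts[:, None] + offsets[None, :]
    y = np.sin(freq * x)[..., None]  # shape (num_samples, n_in + n_out, 1)
    return y[:, :n_in], y[:, n_in:]

X_rand, Y_rand = random_phase_windows(freq=0.25, n_in=14, n_out=14, num_samples=256)
print(X_rand.shape, Y_rand.shape)
```

The resulting arrays have the (samples, timesteps, features) layout that the question's model expects, so they could be passed to `model.fit` in place of `Xtrain` and `Ytrain`.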

Archived:	4 years, 5 months ago
Views:	111
Last recorded:	4 years, 5 months ago