Aar*_*ack 5 time-series prediction neural-network deep-learning keras
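I am using an LSTM NN with Keras for time series forecasting. There are two input features (precipitation and temperature), and the single target to predict is the groundwater level.

It seems to work quite well, although there is a serious shift between the actual data and the output (see figure).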
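
Now I have read that this can be a typical sign that the network is not working properly, because it seems to merely mimic the output:

"What the model is actually doing is that when predicting the value at time “t+1”, it simply uses the value at time “t” as its prediction" (https://towardsdatascience.com/how-not-to-use-machine-learning-for-time-series-forecasting-avoiding-the-pitfalls-19f9d7adf424)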
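
One way to test whether a forecast is merely echoing the previous observation is to compare its error against the naive persistence forecast described in the quote. The sketch below is illustrative only: the function names `mse`, `persistence_mse` and `beats_persistence` are hypothetical, and it assumes the true and predicted values are 1-D sequences (lists or flattened arrays, e.g. after `.ravel()`) on the same scale and aligned on the same time index.

```python
def mse(a, b):
    """Mean squared error between two equally long 1-D sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def persistence_mse(y_true):
    """MSE of the naive forecast that predicts y[t+1] with y[t]."""
    return mse(y_true[1:], y_true[:-1])


def beats_persistence(y_true, y_pred):
    """True if the model's MSE on steps 1..n-1 is below the naive forecast's MSE.

    The first step is skipped because the naive forecast has no
    previous value to copy there.
    """
    return mse(y_true[1:], y_pred[1:]) < persistence_mse(y_true)
```

If a model does not clearly beat this baseline on the test set, a good-looking plot is probably not evidence of real predictive skill.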
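
However, in my case this is actually not possible, because the target values are not used as input variables. I am using a multivariate time series with two features that are separate from the output feature.

Furthermore, the predicted values are not shifted into the future (t+1), but rather seem to lag behind (t-1).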
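
The apparent offset can be measured instead of being read off the plot. The following is a minimal sketch, assuming both series are 1-D, equally long and on the same time index; `shifted_mse` and `estimate_lag` are hypothetical names, not part of Keras or of the code in this post. It tries every shift in a small range and returns the one with the lowest error. A positive result means the prediction at time t best matches the observation at time t - lag, i.e. the predictions trail the data.

```python
def shifted_mse(y_true, y_pred, lag):
    """MSE between y_pred[t] and y_true[t - lag] over the overlapping range."""
    n = len(y_true)
    start = max(0, lag)        # first t for which t - lag is a valid index
    stop = min(n, n + lag)     # one past the last such t
    errors = [(y_pred[t] - y_true[t - lag]) ** 2 for t in range(start, stop)]
    return sum(errors) / len(errors)


def estimate_lag(y_true, y_pred, max_lag=10):
    """Return the shift in [-max_lag, max_lag] with the lowest shifted MSE."""
    if len(y_true) != len(y_pred):
        raise ValueError("sequences must have the same length")
    if max_lag >= len(y_true):
        raise ValueError("max_lag must be smaller than the series length")
    return min(range(-max_lag, max_lag + 1),
               key=lambda lag: shifted_mse(y_true, y_pred, lag))
```

A result of 0 would suggest that the offset seen in the figure comes from how the curves are plotted rather than from the predictions themselves.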

Here is the full code of my network:

    # Imports assumed from the names used below
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.preprocessing import MinMaxScaler
    from keras import layers
    from keras.layers import Input
    from keras.models import Model

    # data, orig_data, window, batch, epoch, act, tensorboard and the helper
    # module H are defined elsewhere and are not shown here.

    # Split in Input and Output Data
    x_1 = data[['MeanT']].values
    x_2 = data[['Precip']].values
    y = data[['Z_424A_6857']].values

    # Scale Data
    x = np.hstack([x_1, x_2])
    scaler = MinMaxScaler(feature_range=(0, 1))
    x = scaler.fit_transform(x)

    scaler_out = MinMaxScaler(feature_range=(0, 1))
    y = scaler_out.fit_transform(y)

    # Reshape Data
    # Note: the unscaled x_1 and x_2 are passed on here; the scaled array x is not used below.
    x_1, x_2, y = H.create2feature_data(x_1, x_2, y, window)
    train_size = int(len(x_1) * .8)
    test_size = int(len(x_1)) # * .5

    # 3D tensor with shape (batch_size, timesteps, input_dim)
    # i.e. (nr. of samples, nr. of timesteps, nr. of features)
    x_1 = np.expand_dims(x_1, 2)
    x_2 = np.expand_dims(x_2, 2)
    y = np.expand_dims(y, 1)

    # Split Training Data
    x_1_train = x_1[:train_size]
    x_2_train = x_2[:train_size]
    y_train = y[:train_size]

    # Split Test Data
    x_1_test = x_1[train_size:test_size]
    x_2_test = x_2[train_size:test_size]
    y_test = y[train_size:test_size]

    # Define Model Input Sets
    inputA = Input(shape=(window, 1))
    inputB = Input(shape=(window, 1))

    # Build Model Branch 1
    branch_1 = layers.GRU(16, activation=act, dropout=0, return_sequences=False, stateful=False, batch_input_shape=(batch, 30, 1))(inputA)
    branch_1 = layers.Dense(8, activation=act)(branch_1)
    #branch_1 = layers.Dropout(0.2)(branch_1)
    branch_1 = Model(inputs=inputA, outputs=branch_1)

    # Build Model Branch 2
    branch_2 = layers.GRU(16, activation=act, dropout=0, return_sequences=False, stateful=False, batch_input_shape=(batch, 30, 1))(inputB)
    branch_2 = layers.Dense(8, activation=act)(branch_2)
    #branch_2 = layers.Dropout(0.2)(branch_2)
    branch_2 = Model(inputs=inputB, outputs=branch_2)

    # Combine Model Branches
    combined = layers.concatenate([branch_1.output, branch_2.output])

    # apply a FC layer and then a regression prediction on the combined outputs
    comb = layers.Dense(6, activation=act)(combined)
    comb = layers.Dense(1, activation="linear")(comb)

    # Accept the inputs of the two branches and then output a single value
    model = Model(inputs=[branch_1.input, branch_2.input], outputs=comb)
    model.compile(loss='mse', optimizer='adam', metrics=['mse', H.r2_score])

    model.summary()

    # Training
    model.fit([x_1_train, x_2_train], y_train, epochs=epoch, batch_size=batch, validation_split=0.2, callbacks=[tensorboard])
    model.reset_states()

    # Evaluation
    print('Train evaluation')
    print(model.evaluate([x_1_train, x_2_train], y_train))

    print('Test evaluation')
    print(model.evaluate([x_1_test, x_2_test], y_test))

    # Predictions
    predictions_train = model.predict([x_1_train, x_2_train])
    predictions_test = model.predict([x_1_test, x_2_test])

    predictions_train = np.reshape(predictions_train, (-1,1))
    predictions_test = np.reshape(predictions_test, (-1,1))

    # Reverse Scaling
    predictions_train = scaler_out.inverse_transform(predictions_train)
    predictions_test = scaler_out.inverse_transform(predictions_test)

    # Plot results
    plt.figure(figsize=(15, 6))
    plt.plot(orig_data, color='blue', label='True GWL')
    plt.plot(range(train_size), predictions_train, color='red', label='Predicted GWL (Training)')
    plt.plot(range(train_size, test_size), predictions_test, color='green', label='Predicted GWL (Test)')
    plt.title('GWL Prediction')
    plt.xlabel('Day')
    plt.ylabel('GWL')
    plt.legend()
    plt.show()

I am using a batch size of 30, a lookback of 90 time steps, and a total data size of about 7500 time steps.
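
Because `H.create2feature_data` is not shown, it is unclear which time step each target is paired with. The sketch below is not that helper; `make_windows` and its `horizon` parameter are hypothetical and only illustrate how a lookback window can be aligned with a target. It also returns the original time index of every target, so predictions can be plotted at the step they actually refer to.

```python
def make_windows(features, target, window, horizon=0):
    """Pair each lookback window with one target value.

    The window covers features[start : start + window]. Its target is taken
    `horizon` steps after the last input step, so horizon=0 means
    "same time as the last input" and horizon=1 means "one step ahead".
    Returns (x_windows, y_values, y_index), where y_index holds the position
    of each target in the original series.
    """
    if len(features) != len(target):
        raise ValueError("features and target must have the same length")
    if window < 1 or horizon < 0:
        raise ValueError("window must be >= 1 and horizon must be >= 0")

    x_windows, y_values, y_index = [], [], []
    last_start = len(features) - window - horizon
    for start in range(last_start + 1):
        end = start + window          # exclusive end of the input window
        t = end - 1 + horizon         # index of the paired target
        x_windows.append(list(features[start:end]))
        y_values.append(target[t])
        y_index.append(t)
    return x_windows, y_values, y_index
```

With a series of 7500 steps and `window=90`, this sketch yields 7411 samples at `horizon=0`, and the first target sits at index 89 rather than 0. Plotting predictions against `y_index` instead of a range that starts at zero keeps them on the same time axis as the original data.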

Any help would be greatly appreciated :-) Thanks!