Jak*_*kob 21 time-series prediction lstm tensorflow
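I am currently trying to build a simple model for predicting time series. The goal is to train the model on a sequence so that it can predict future values.
I am using TensorFlow and LSTM cells to do this. The model is trained with truncated backpropagation through time. My question is how to structure the training data.
For example, suppose we want to learn the given sequence: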
[1,2,3,4,5,6,7,8,9,10,11,...]
We unroll the network for num_steps=4.
Option 1
input data    label
1,2,3,4       2,3,4,5
5,6,7,8       6,7,8,9
9,10,11,12    10,11,12,13
...
Option 2
input data    label
1,2,3,4       2,3,4,5
2,3,4,5       3,4,5,6
3,4,5,6       4,5,6,7
...
Option 3
input data    label
1,2,3,4       5
2,3,4,5       6
3,4,5,6       7
...
Option 4
input data    label
1,2,3,4       5
5,6,7,8       9
9,10,11,12    13
...
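The four options differ in only two things: how far the window moves between samples, and whether the label is the shifted window or just the next value. Here is a minimal NumPy sketch of that idea. The helper name `make_windows` and its `stride` and `label` parameters are made up for illustration and are not part of TensorFlow.

```python
import numpy as np


def make_windows(series, num_steps, stride, label="shifted"):
    """Cut a 1-D series into (input, label) pairs.

    stride=num_steps gives non-overlapping windows (options 1 and 4),
    stride=1 gives sliding windows (options 2 and 3).
    label="shifted" returns the window moved one step ahead (options 1 and 2),
    label="next" returns only the value right after the window (options 3 and 4).
    """
    series = np.asarray(series)
    # Last start index that still leaves room for the one-step-ahead label.
    last_start = len(series) - num_steps - 1
    starts = range(0, last_start + 1, stride)
    inputs = np.stack([series[s:s + num_steps] for s in starts])
    if label == "shifted":
        labels = np.stack([series[s + 1:s + num_steps + 1] for s in starts])
    elif label == "next":
        labels = np.array([series[s + num_steps] for s in starts])
    else:
        raise ValueError("label must be 'shifted' or 'next'")
    return inputs, labels


series = np.arange(1, 14)  # the example sequence 1..13
opt1 = make_windows(series, num_steps=4, stride=4, label="shifted")
opt2 = make_windows(series, num_steps=4, stride=1, label="shifted")
opt3 = make_windows(series, num_steps=4, stride=1, label="next")
opt4 = make_windows(series, num_steps=4, stride=4, label="next")
```

Each call returns an `(inputs, labels)` tuple whose rows correspond to the rows of the matching table above.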
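Any help would be appreciated.
I am just getting started with LSTMs in TensorFlow and am trying to implement an example that (luckily) also tries to predict a time series / number sequence generated by a simple math function.
However, I structure the training data in a different way, motivated by the paper "Unsupervised Learning of Video Representations using LSTMs":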
Option 5:
input data    label
1,2,3,4       5,6,7,8
2,3,4,5       6,7,8,9
3,4,5,6       7,8,9,10
...
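In option 5 each label is the block of num_steps values that directly follows the input window. A small NumPy sketch of that layout is below. The function name `make_seq2seq_windows` and the `stride` parameter are only illustrative assumptions.

```python
import numpy as np


def make_seq2seq_windows(series, num_steps, stride=1):
    """Pair each input window with the num_steps values that follow it."""
    series = np.asarray(series)
    # Last start index that still leaves room for a full label window.
    last_start = len(series) - 2 * num_steps
    starts = range(0, last_start + 1, stride)
    inputs = np.stack([series[s:s + num_steps] for s in starts])
    labels = np.stack([series[s + num_steps:s + 2 * num_steps] for s in starts])
    return inputs, labels


inputs, labels = make_seq2seq_windows(np.arange(1, 11), num_steps=4)
```

For the sequence 1..10 this gives the three rows of the option 5 table.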
Besides this paper, I (tried to) take inspiration from the TensorFlow RNN examples. My current complete solution looks like this:
import math
import random
import numpy as np
import tensorflow as tf

LSTM_SIZE = 64
LSTM_LAYERS = 2
BATCH_SIZE = 16
NUM_T_STEPS = 4
MAX_STEPS = 1000
LAMBDA_REG = 5e-4


def ground_truth_func(i, j, t):
    return i * math.pow(t, 2) + j


def get_batch(batch_size):
    seq = np.zeros([batch_size, NUM_T_STEPS, 1], dtype=np.float32)
    tgt = np.zeros([batch_size, NUM_T_STEPS], dtype=np.float32)

    for b in range(batch_size):
        i = float(random.randint(-25, 25))
        j = float(random.randint(-100, 100))
        # Input: the first NUM_T_STEPS values of the function
        for t in range(NUM_T_STEPS):
            value = ground_truth_func(i, j, t)
            seq[b, t, 0] = value

        # Target: the NUM_T_STEPS values that follow the input
        for t in range(NUM_T_STEPS):
            tgt[b, t] = ground_truth_func(i, j, t + NUM_T_STEPS)
    return seq, tgt


# Placeholder for the inputs in a given iteration
sequence = tf.placeholder(tf.float32, [BATCH_SIZE, NUM_T_STEPS, 1])
target = tf.placeholder(tf.float32, [BATCH_SIZE, NUM_T_STEPS])

fc1_weight = tf.get_variable('w1', [LSTM_SIZE, 1], initializer=tf.random_normal_initializer(mean=0.0, stddev=1.0))
fc1_bias = tf.get_variable('b1', [1], initializer=tf.constant_initializer(0.1))

# ENCODER
with tf.variable_scope('ENC_LSTM'):
    lstm = tf.nn.rnn_cell.LSTMCell(LSTM_SIZE)
    multi_lstm = tf.nn.rnn_cell.MultiRNNCell([lstm] * LSTM_LAYERS)
    initial_state = multi_lstm.zero_state(BATCH_SIZE, tf.float32)
    state = initial_state
    for t_step in range(NUM_T_STEPS):
        if t_step > 0:
            tf.get_variable_scope().reuse_variables()

        # state value is updated after processing each batch of sequences
        output, state = multi_lstm(sequence[:, t_step, :], state)

learned_representation = state

# DECODER
with tf.variable_scope('DEC_LSTM'):
    lstm = tf.nn.rnn_cell.LSTMCell(LSTM_SIZE)
    multi_lstm = tf.nn.rnn_cell.MultiRNNCell([lstm] * LSTM_LAYERS)
    state = learned_representation
    logits_stacked = None
    loss = 0.0
    for t_step in range(NUM_T_STEPS):
        if t_step > 0:
            tf.get_variable_scope().reuse_variables()

        # state value is updated after processing each batch of sequences
        output, state = multi_lstm(sequence[:, t_step, :], state)
        # output can be used to make next number prediction
        logits = tf.matmul(output, fc1_weight) + fc1_bias

        if logits_stacked is None:
            logits_stacked = logits
        else:
            logits_stacked = tf.concat(1, [logits_stacked, logits])

        # NOTE: logits is [BATCH_SIZE, 1] while target[:, t_step] is [BATCH_SIZE],
        # so this difference broadcasts to [BATCH_SIZE, BATCH_SIZE].
        loss += tf.reduce_sum(tf.square(logits - target[:, t_step])) / BATCH_SIZE

reg_loss = loss + LAMBDA_REG * (tf.nn.l2_loss(fc1_weight) + tf.nn.l2_loss(fc1_bias))

train = tf.train.AdamOptimizer().minimize(reg_loss)

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())

    total_loss = 0.0
    for step in range(MAX_STEPS):
        seq_batch, target_batch = get_batch(BATCH_SIZE)

        feed = {sequence: seq_batch, target: target_batch}
        _, current_loss = sess.run([train, reg_loss], feed)
        if step % 10 == 0:
            print("@{}: {}".format(step, current_loss))
        total_loss += current_loss

    print('Total loss:', total_loss)

    print('### SIMPLE EVAL: ###')
    seq_batch, target_batch = get_batch(BATCH_SIZE)
    feed = {sequence: seq_batch, target: target_batch}
    prediction = sess.run([logits_stacked], feed)
    for b in range(BATCH_SIZE):
        print("{} -> {})".format(str(seq_batch[b, :, 0]), target_batch[b, :]))
        print(" `-> Prediction: {}".format(prediction[0][b]))
Sample output of this example looks like this:
### SIMPLE EVAL: ###
# [input seq] -> [target prediction]
# `-> Prediction: [model prediction]
[ 33. 53. 113. 213.] -> [ 353. 533. 753. 1013.])
`-> Prediction: [ 19.74548721 28.3149128 33.11489105 35.06603241]
[ -17. -32. -77. -152.] -> [-257. -392. -557. -752.])
`-> Prediction: [-16.38951683 -24.3657589 -29.49801064 -31.58583832]
[ -7. -4. 5. 20.] -> [ 41. 68. 101. 140.])
`-> Prediction: [ 14.14126873 22.74848557 31.29668617 36.73633194]
...
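The model is an LSTM autoencoder whose encoder and decoder have 2 layers each.
Unfortunately, as you can see in the results, this model does not learn the sequence properly. It may well be that I simply made a silly mistake somewhere, or that 1000-10000 training steps are just too few for an LSTM. As I said, I am also only just starting to understand and use LSTMs properly. But hopefully this gives you some inspiration regarding the implementation.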
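One thing that may be worth checking, offered as a possible culprit rather than a confirmed diagnosis: in the loss line of the code above, `logits` has shape [BATCH_SIZE, 1] while `target[:, t_step]` has shape [BATCH_SIZE]. TensorFlow follows NumPy-style broadcasting, so the shape effect can be reproduced with plain NumPy. The array names below are illustrative stand-ins for the tensors.

```python
import numpy as np

BATCH_SIZE = 16

# Stand-in for tf.matmul(output, fc1_weight) + fc1_bias: one column per sample.
logits = np.zeros((BATCH_SIZE, 1), dtype=np.float32)
# Stand-in for target[:, t_step]: a flat vector with one value per sample.
target_t = np.ones((BATCH_SIZE,), dtype=np.float32)

# (16, 1) minus (16,) broadcasts to (16, 16): every prediction is
# compared against every sample's target, not only its own.
diff_broadcast = logits - target_t

# Giving the target an explicit column axis keeps the comparison per sample.
diff_per_sample = logits - target_t[:, None]

print(diff_broadcast.shape, diff_per_sample.shape)
```

If this is indeed the issue, slicing the target as `target[:, t_step:t_step + 1]` would keep both operands at [BATCH_SIZE, 1]. Whether that alone makes the model learn the sequence is not something this sketch shows.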
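After reading several LSTM introduction blogs (e.g. Jakob Aungiers'), option 3 seems to be the right choice for a stateless LSTM.
If your LSTM needs to remember data from further back than your num_steps, you can train in a stateful way - for a Keras example, see Philippe Remy's blog post "Stateful LSTM in Keras". However, Philippe does not show an example with a batch size greater than one. I guess that in your case a batch size of four with a stateful LSTM could be used with the following data (written as input -> label):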
batch #0:
1,2,3,4 -> 5
2,3,4,5 -> 6
3,4,5,6 -> 7
4,5,6,7 -> 8
batch #1:
5,6,7,8 -> 9
6,7,8,9 -> 10
7,8,9,10 -> 11
8,9,10,11 -> 12
batch #2:
9,10,11,12 -> 13
...
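With this, the state of, for example, the 2nd sample in batch #0 is correctly reused to continue training with the 2nd sample of batch #1.
This is somewhat similar to your option 4, except that there you are not using all the available labels.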
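The batch layout above can be generated mechanically. This is a minimal NumPy sketch, assuming batch_size equals num_steps as in the example. The helper name `make_stateful_batches` is made up for illustration.

```python
import numpy as np


def make_stateful_batches(series, num_steps):
    """Build batches so that sample i of batch k+1 continues sample i of batch k.

    Assumes batch_size == num_steps: inside a batch the windows slide by one
    step, and from one batch to the next every window moves on by num_steps.
    """
    series = np.asarray(series)
    batch_size = num_steps
    batches = []
    start = 0
    # The last window of a batch starts at start + batch_size - 1 and
    # still needs one more value after it to serve as the label.
    while start + (batch_size - 1) + num_steps < len(series):
        rows = [start + i for i in range(batch_size)]
        x = np.stack([series[r:r + num_steps] for r in rows])
        y = np.array([series[r + num_steps] for r in rows])
        batches.append((x, y))
        start += num_steps
    return batches


batches = make_stateful_batches(np.arange(1, 13), num_steps=4)
for k, (x, y) in enumerate(batches):
    print("batch #{}:".format(k))
    for window, label in zip(x, y):
        print("  {} -> {}".format(window.tolist(), label))
```

For the state to carry over as described, the batches have to be fed in this order, without shuffling.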
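Update:
As an extension of my suggestion, where batch_size equals num_steps, Alexis Huet gives an answer for the case where batch_size is a divisor of num_steps, which can be used for larger num_steps. He describes it nicely on his blog.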