Blu*_*ngo 6 python keras tensorflow attention-model seq2seq
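I built a Seq2Seq model for text summarization. I have two versions of it, one with attention and one without. The version without attention generates predictions, but I cannot get predictions from the version with attention, even though it trains successfully.
Here is my model: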
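# Imports assumed to come from tensorflow.keras; x_voc, y_voc and max_text_len are defined elsewhere
from tensorflow.keras.backend import clear_session
from tensorflow.keras.layers import (Input, Embedding, LSTM, Bidirectional, Concatenate,
                                     Activation, TimeDistributed, Dense, dot)
from tensorflow.keras.models import Model
latent_dim = 300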
embedding_dim = 200
clear_session()
# Encoder
encoder_inputs = Input(shape=(max_text_len, ))
# Embedding layer
enc_emb = Embedding(x_voc, embedding_dim,
trainable=True)(encoder_inputs)
# Encoder LSTM 1
encoder_lstm1 = Bidirectional(LSTM(latent_dim, return_sequences=True,
return_state=True, dropout=0.4,
recurrent_dropout=0.4))
(encoder_output1, forward_h1, forward_c1, backward_h1, backward_c1) = encoder_lstm1(enc_emb)
# Encoder LSTM 2
encoder_lstm2 = Bidirectional(LSTM(latent_dim, return_sequences=True,
return_state=True, dropout=0.4,
recurrent_dropout=0.4))
(encoder_output2, forward_h2, forward_c2, backward_h2, backward_c2) = encoder_lstm2(encoder_output1)
# Encoder LSTM 3
encoder_lstm3 = Bidirectional(LSTM(latent_dim, return_state=True,
return_sequences=True, dropout=0.4,
recurrent_dropout=0.4))
(encoder_outputs, forward_h, forward_c, backward_h, backward_c) = encoder_lstm3(encoder_output2)
state_h = Concatenate()([forward_h, backward_h])
state_c = Concatenate()([forward_c, backward_c])
# Set up the decoder, using encoder_states as the initial state
decoder_inputs = Input(shape=(None, ))
# Embedding layer
dec_emb_layer = Embedding(y_voc, embedding_dim, trainable=True)
dec_emb = dec_emb_layer(decoder_inputs)
# Decoder LSTM
decoder_lstm = LSTM(latent_dim*2, return_sequences=True,
return_state=True, dropout=0.4,
recurrent_dropout=0.2)
(decoder_outputs, decoder_fwd_state, decoder_back_state) = \
decoder_lstm(dec_emb, initial_state=[state_h, state_c])
#start Attention part
attention = dot([decoder_outputs, encoder_outputs], axes=[2, 2])
attention = Activation('softmax')(attention)
context = dot([attention, encoder_outputs], axes=[2,1])
decoder_outputs = Concatenate()([context, decoder_outputs])
#end Attention
# Dense layer
decoder_dense = TimeDistributed(Dense(y_voc, activation='softmax'))(decoder_outputs)
# Define the model
model = Model([encoder_inputs, decoder_inputs], decoder_dense)
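To make the shapes in the attention part (between the start and end comments) easier to follow, here is a minimal NumPy sketch of the same dot-product attention. The function name `dot_attention` and the toy sizes are illustrative assumptions, not part of the original model; `dim` stands in for `latent_dim*2`.

```python
import numpy as np

def dot_attention(decoder_outputs, encoder_outputs):
    """Dot-product attention, mirroring dot(axes=[2, 2]) -> softmax -> dot(axes=[2, 1])."""
    # scores[b, t, s] = <decoder_outputs[b, t, :], encoder_outputs[b, s, :]>
    scores = np.einsum("btd,bsd->bts", decoder_outputs, encoder_outputs)
    # Softmax over the source axis (the last axis), shifted for numerical stability
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # context[b, t, :] = sum over s of weights[b, t, s] * encoder_outputs[b, s, :]
    context = np.einsum("bts,bsd->btd", weights, encoder_outputs)
    # Concatenate context and decoder output along the feature axis
    return weights, np.concatenate([context, decoder_outputs], axis=-1)

# Hypothetical toy sizes; dim plays the role of latent_dim * 2
batch, tgt_len, src_len, dim = 2, 4, 6, 600
rng = np.random.default_rng(0)
dec = rng.normal(size=(batch, tgt_len, dim))
enc = rng.normal(size=(batch, src_len, dim))

weights, combined = dot_attention(dec, enc)
print(weights.shape)   # (2, 4, 6)
print(combined.shape)  # (2, 4, 1200)
```

The point to notice is that the attention output depends on `encoder_outputs` at every decoder step, so any model that contains the attention block needs access to the encoder's full output sequence, not only its final states.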
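Here is how I build the encoder and decoder used to generate predictions:
from tensorflow.keras.models import load_model  # import assumed; not shown in the original snippet
model = load_model("model_intro.h5")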
encoder_inputs = model.input[0] # input_1
encoder_outputs, forward_h, forward_c, backward_h, backward_c = model.layers[5].output #Bi-lstm2
state_h_enc = Concatenate()([forward_h, backward_h])
state_c_enc = Concatenate()([forward_c, backward_c])
encoder_states = [state_h_enc, state_c_enc]
encoder_model = Model(encoder_inputs, encoder_states)
decoder_inputs = model.input[1] # input_2
decoder_state_input_h = Input(shape=(latent_dim*2,), name="input_3")
decoder_state_input_c = Input(shape=(latent_dim*2,), name="input_4")
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_emdedding = model.layers[6](decoder_inputs)
decoder_lstm = model.layers[9]
decoder_outputs, state_h_dec, state_c_dec = decoder_lstm(decoder_emdedding, initial_state=decoder_states_inputs)
decoder_states = [state_h_dec, state_c_dec]
#start Attention
attention = dot([decoder_outputs, encoder_outputs], axes=[2, 2])
attention2 = Activation('softmax')(attention)
context = dot([attention2, encoder_outputs], axes=[2,1])
decoder_outputs = Concatenate(axis=-1)([context, decoder_outputs])
#end Attention
decoder_dense = model.layers[-1]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model(
[decoder_inputs] + decoder_states_inputs, [decoder_outputs] + decoder_states
)
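If I remove the attention part, which is marked in the code by the start and end comments, everything works. The model with attention also trains successfully, but when I build the encoder and decoder for generating predictions, I get: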
ValueError: Graph disconnected: cannot obtain value for tensor KerasTensor(type_spec=TensorSpec(shape=(None, 300), dtype=tf.float32, name='input_1'), name='input_1', description="created by layer 'input_1'") at layer "embedding". The following previous layers were accessed without issue: []
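One possible reading of this error, offered as a sketch rather than a confirmed fix: in the inference code, the attention block uses `encoder_outputs`, which is computed from `input_1`, but `input_1` is not among the inputs of `decoder_model`, so Keras cannot trace the graph back to an input. A common way around this is to have the encoder model also return its output sequence and to feed that sequence to the decoder through a new `Input`. The toy model below is a simplified stand-in (a single unidirectional LSTM encoder, hypothetical sizes and names), not the model from the question.

```python
import numpy as np
from tensorflow.keras.layers import (Input, Embedding, LSTM, Dense, Dot,
                                     Activation, Concatenate)
from tensorflow.keras.models import Model

# Hypothetical toy sizes, much smaller than in the question
src_len, vocab, emb_dim, units = 7, 50, 8, 16

def attend(dec_seq, enc_seq):
    """Dot-product attention followed by concatenation with the decoder output."""
    scores = Dot(axes=[2, 2])([dec_seq, enc_seq])
    weights = Activation("softmax")(scores)
    context = Dot(axes=[2, 1])([weights, enc_seq])
    return Concatenate()([context, dec_seq])

# Training graph
enc_in = Input(shape=(src_len,))
enc_seq, enc_h, enc_c = LSTM(units, return_sequences=True, return_state=True)(
    Embedding(vocab, emb_dim)(enc_in))

dec_in = Input(shape=(None,))
dec_emb_layer = Embedding(vocab, emb_dim)
dec_lstm = LSTM(units, return_sequences=True, return_state=True)
dec_dense = Dense(vocab, activation="softmax")
dec_seq, _, _ = dec_lstm(dec_emb_layer(dec_in), initial_state=[enc_h, enc_c])
train_model = Model([enc_in, dec_in], dec_dense(attend(dec_seq, enc_seq)))

# Inference encoder: return the full output sequence as well as the states
encoder_model = Model(enc_in, [enc_seq, enc_h, enc_c])

# Inference decoder: everything it needs arrives through its own Inputs,
# including the encoder output sequence used by the attention block
enc_seq_in = Input(shape=(src_len, units))
h_in = Input(shape=(units,))
c_in = Input(shape=(units,))
step_seq, h_out, c_out = dec_lstm(dec_emb_layer(dec_in), initial_state=[h_in, c_in])
step_probs = dec_dense(attend(step_seq, enc_seq_in))
decoder_model = Model([dec_in, enc_seq_in, h_in, c_in], [step_probs, h_out, c_out])

# One decoding step on random token ids
src = np.random.randint(1, vocab, size=(2, src_len))
enc_seq_val, h, c = encoder_model.predict(src, verbose=0)
start_tokens = np.zeros((2, 1))
probs, h, c = decoder_model.predict([start_tokens, enc_seq_val, h, c], verbose=0)
print(probs.shape)  # (2, 1, 50)
```

Applied to the model in the question, the extra decoder input would presumably have shape `(max_text_len, latent_dim*2)`, since the encoder there is bidirectional; that mapping is an assumption and has not been run against the original code.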
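I have a similar problem. My model trains fine, but when I set up inference for the seq2seq architecture, I cannot understand why the embedding layer causes an error. Here is the model architecture:
Code: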
from keras.models import load_model, Model
from keras.layers import Input, Embedding
LATENT_DIM = 128
EMBEDDING_DIM = 200
num_words_output = 2000
model = load_model('s2s')
model.summary()
encoder_inputs = model.input[0] # input_1
encoder_outputs, state_h_enc, state_c_enc = model.layers[4].output # lstm_1
encoder_states = [state_h_enc, state_c_enc]
encoder_model = Model(encoder_inputs, encoder_states)
decoder_inputs = model.input[1] # input_2
decoder_state_input_h = Input(shape=(LATENT_DIM,))
decoder_state_input_c = Input(shape=(LATENT_DIM,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_lstm = model.layers[5]
decoder_outputs, state_h_dec, state_c_dec = model.layers[5].output # lstm_2
decoder_states = [state_h_dec, state_c_dec]
decoder_dense = model.layers[6]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model([decoder_inputs] + decoder_states_inputs, [decoder_outputs] + decoder_states)
Error:
Line 29: decoder_model = Model([decoder_inputs] + decoder_states_inputs, [decoder_outputs] + decoder_states)
ValueError: Graph disconnected: cannot obtain value for tensor KerasTensor(type_spec=TensorSpec(shape=(None, 16), dtype=tf.float32, name='input_1'), name='input_1', description="created by layer 'input_1'") at layer "embedding". The following previous layers were accessed without issue: []
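A likely cause here, stated as an assumption since the saved model `s2s` is not shown: `model.layers[5].output` is the decoder LSTM's output inside the training graph, where its initial state comes from the encoder and therefore from `input_1`. The new state inputs `decoder_state_input_h` and `decoder_state_input_c` are never actually connected to it. The sketch below uses a hypothetical toy model to show the usual alternative, which is to call the decoder LSTM layer again with the new state inputs instead of reusing its stored `.output`.

```python
import numpy as np
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
from tensorflow.keras.models import Model

# Hypothetical toy sizes and layer names
src_len, vocab, emb_dim, units = 5, 40, 8, 16

# Training graph: the decoder LSTM starts from the encoder's final states
enc_in = Input(shape=(src_len,))
_, enc_h, enc_c = LSTM(units, return_state=True)(Embedding(vocab, emb_dim)(enc_in))

dec_in = Input(shape=(None,))
dec_emb = Embedding(vocab, emb_dim)(dec_in)
dec_lstm = LSTM(units, return_sequences=True, return_state=True)
dec_dense = Dense(vocab, activation="softmax")
train_seq, _, _ = dec_lstm(dec_emb, initial_state=[enc_h, enc_c])
train_model = Model([enc_in, dec_in], dec_dense(train_seq))

# Inference decoder: call the LSTM layer again so that its state really
# comes from the new Inputs rather than from the encoder branch
h_in = Input(shape=(units,))
c_in = Input(shape=(units,))
seq, h_out, c_out = dec_lstm(dec_emb, initial_state=[h_in, c_in])
decoder_model = Model([dec_in, h_in, c_in], [dec_dense(seq), h_out, c_out])

# One decoding step with zero states
tokens = np.zeros((3, 1))
zeros = np.zeros((3, units))
probs, h, c = decoder_model.predict([tokens, zeros, zeros], verbose=0)
print(probs.shape)  # (3, 1, 40)
```

The layer weights are shared between `train_model` and `decoder_model`, because the same layer objects are reused; only the wiring of the inputs differs.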