I am building a Bi-LSTM model and would like to add an attention layer to it, but I don't know how.
My current model code is:
from keras.models import Sequential
from keras.layers import Embedding, BatchNormalization, Activation, Dropout, Bidirectional, LSTM, Dense

model = Sequential()
model.add(Embedding(max_words, 1152, input_length=max_len, weights=[embeddings]))
model.add(BatchNormalization())
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Bidirectional(LSTM(32)))
model.add(BatchNormalization())
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
model.summary()
The model summary is:
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_1 (Embedding) (None, 1152, 1152) 278396928
_________________________________________________________________
batch_normalization_1 (Batch (None, 1152, 1152) 4608
_________________________________________________________________
activation_1 (Activation) (None, 1152, 1152) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 1152, 1152) 0
_________________________________________________________________
bidirectional_1 (Bidirection (None, 64) 303360
_________________________________________________________________
batch_normalization_2 (Batch (None, 64) 256
_________________________________________________________________
activation_2 (Activation) (None, 64) 0
_________________________________________________________________
dropout_2 (Dropout) …
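For reference, one common way to add attention here is sketched below (this is not from the original post; it reuses the max_words, max_len and embeddings variables above and switches to the functional API, since the weighted sum is awkward to express with Sequential). The Bi-LSTM keeps its per-timestep outputs via return_sequences=True, a small Dense layer scores each timestep, a softmax over the time axis turns the scores into attention weights, and the weighted sum of the timestep outputs becomes the context vector fed to the classifier:

from keras.models import Model
from keras.layers import (Input, Embedding, BatchNormalization, Activation, Dropout,
                          Bidirectional, LSTM, Dense, Softmax, Dot, Flatten)

inputs = Input(shape=(max_len,))
x = Embedding(max_words, 1152, input_length=max_len, weights=[embeddings])(inputs)
x = BatchNormalization()(x)
x = Activation('tanh')(x)
x = Dropout(0.5)(x)
h = Bidirectional(LSTM(32, return_sequences=True))(x)        # (batch, max_len, 64)
scores = Dense(1)(h)                                          # one score per timestep
weights = Softmax(axis=1, name='attention_weights')(scores)   # normalise over time
context = Dot(axes=1)([weights, h])                           # weighted sum: (batch, 1, 64)
context = Flatten()(context)                                  # (batch, 64)
context = BatchNormalization()(context)
context = Activation('tanh')(context)
context = Dropout(0.5)(context)
outputs = Dense(1, activation='sigmoid')(context)
model = Model(inputs, outputs)

The attention weights can be inspected afterwards with Model(inputs, weights).predict(...), which returns one weight per input position.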
Using the following code:
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout, PReLU, Flatten

model = Sequential()
num_features = data.shape[2]
num_samples = data.shape[1]
model.add(
LSTM(16, batch_input_shape=(None, num_samples, num_features), return_sequences=True, activation='tanh'))
model.add(PReLU())
model.add(Dropout(0.5))
model.add(LSTM(8, return_sequences=True, activation='tanh'))
model.add(Dropout(0.1))
model.add(PReLU())
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))
I am trying to work out how to add an attention mechanism before the first LSTM layer. I found Philippe Rémy's keras-attention-mechanism on GitHub, but I could not figure out how to use it with my code.
I would also like to visualise the attention weights to see which features the model focuses on.
Any help would be greatly appreciated, especially concrete code modifications. Thanks :)
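For illustration, here is a minimal sketch of attention applied before the first LSTM layer, roughly in the spirit of that repository (it is not the repo's exact code; it reuses the data array from the snippet above and uses the functional API so the attention weights live in a named layer):

from keras.models import Model
from keras.layers import (Input, Dense, LSTM, Dropout, PReLU, Flatten,
                          Permute, Multiply)

num_features = data.shape[2]
num_samples = data.shape[1]

inputs = Input(shape=(num_samples, num_features))
# learn a softmax weight over the timesteps for every feature,
# then re-weight the raw inputs before they reach the first LSTM
a = Permute((2, 1))(inputs)                       # (batch, num_features, num_samples)
a = Dense(num_samples, activation='softmax')(a)   # softmax along the time axis
a = Permute((2, 1), name='attention_weights')(a)  # back to (batch, num_samples, num_features)
weighted = Multiply()([inputs, a])

x = LSTM(16, return_sequences=True, activation='tanh')(weighted)
x = PReLU()(x)
x = Dropout(0.5)(x)
x = LSTM(8, return_sequences=True, activation='tanh')(x)
x = Dropout(0.1)(x)
x = PReLU()(x)
x = Flatten()(x)
outputs = Dense(1, activation='sigmoid')(x)
model = Model(inputs, outputs)

To visualise what the model attends to, build a second model that exposes the attention layer, e.g. attn_model = Model(model.input, model.get_layer('attention_weights').output), and plot attn_model.predict(some_batch) as a heatmap over timesteps and features.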
I am trying to add an attention layer to my text-classification model. The input is a piece of text (e.g. a movie review) and the output is a binary label (e.g. positive vs. negative).
from keras.models import Sequential
from keras.layers import Embedding, Bidirectional, CuDNNGRU, Dense

model = Sequential()
model.add(Embedding(max_features, 32, input_length=maxlen))
model.add(Bidirectional(CuDNNGRU(16,return_sequences=True)))
##### add attention layer here #####
model.add(Dense(1, activation='sigmoid'))
After some searching, I found several ready-to-use attention layers for Keras: the keras.layers.Attention layer built into Keras, plus the SeqWeightedAttention and SeqSelfAttention layers from the keras-self-attention package. As a newcomer to deep learning, I find it hard to understand the mechanics behind these layers.
What does each of these layers do? Which one would suit my model best?
Many thanks!
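For illustration, the simplest drop-in option is probably SeqWeightedAttention, assuming the keras-self-attention package (pip install keras-self-attention). It learns one weight per timestep and returns the weighted sum of the GRU outputs, so its 2-D output can go straight into the final Dense layer; keras.layers.Attention (used as self-attention over the GRU outputs) and SeqSelfAttention instead return a sequence of the same length, which would still need pooling or flattening before the classifier. A sketch:

from keras.models import Sequential
from keras.layers import Embedding, Bidirectional, CuDNNGRU, Dense
from keras_self_attention import SeqWeightedAttention

model = Sequential()
model.add(Embedding(max_features, 32, input_length=maxlen))
model.add(Bidirectional(CuDNNGRU(16, return_sequences=True)))
# collapses (batch, maxlen, 32) into (batch, 32) using learned per-timestep weights
model.add(SeqWeightedAttention())
model.add(Dense(1, activation='sigmoid'))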