相关疑难解决方法(0)

如何使用 keras 构建注意力模型?

我正在尝试理解注意力模型并自己构建一个。经过多次搜索,我发现了这个网站,它有一个用 keras 编码的注意力模型,而且看起来也很简单。但是当我试图在我的机器上构建相同的模型时,它给出了多个参数错误。错误是由于传入 class 的参数不匹配Attention。在网站的注意力类中,它要求一个参数,但它用两个参数启动注意力对象。

import tensorflow as tf

max_len = 200
rnn_cell_size = 128
vocab_size=250

class Attention(tf.keras.Model):
    def __init__(self, units):
        super(Attention, self).__init__()
        self.W1 = tf.keras.layers.Dense(units)
        self.W2 = tf.keras.layers.Dense(units)
        self.V = tf.keras.layers.Dense(1)
    def call(self, features, hidden):
        hidden_with_time_axis = tf.expand_dims(hidden, 1)
        score = tf.nn.tanh(self.W1(features) + self.W2(hidden_with_time_axis))
        attention_weights = tf.nn.softmax(self.V(score), axis=1)
        context_vector = attention_weights * features
        context_vector = tf.reduce_sum(context_vector, axis=1)
        return context_vector, attention_weights

sequence_input = tf.keras.layers.Input(shape=(max_len,), dtype='int32')

embedded_sequences = tf.keras.layers.Embedding(vocab_size, 128, input_length=max_len)(sequence_input)

lstm = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM
                                     (rnn_cell_size,
                                      dropout=0.3,
                                      return_sequences=True, …
Run Code Online (Sandbox Code Playgroud)

python deep-learning keras tensorflow attention-model

14
推荐指数
2
解决办法
1万
查看次数

如何将注意力层添加到 Bi-LSTM

我正在开发一个 Bi-LSTM 模型并想为其添加一个注意力层。但我不知道如何添加它。

我当前的模型代码是

model = Sequential()
model.add(Embedding(max_words, 1152, input_length=max_len, weights=[embeddings]))
model.add(BatchNormalization())
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Bidirectional(LSTM(32)))
model.add(BatchNormalization())
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
model.summary()
Run Code Online (Sandbox Code Playgroud)

模型摘要是

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_1 (Embedding)      (None, 1152, 1152)        278396928 
_________________________________________________________________
batch_normalization_1 (Batch (None, 1152, 1152)        4608      
_________________________________________________________________
activation_1 (Activation)    (None, 1152, 1152)        0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 1152, 1152)        0         
_________________________________________________________________
bidirectional_1 (Bidirection (None, 64)                303360    
_________________________________________________________________
batch_normalization_2 (Batch (None, 64)                256       
_________________________________________________________________
activation_2 (Activation)    (None, 64)                0         
_________________________________________________________________
dropout_2 (Dropout) …
Run Code Online (Sandbox Code Playgroud)

nlp machine-learning python-3.x keras tensorflow

14
推荐指数
1
解决办法
4521
查看次数