Abi*_*cov 6 lstm tensorflow gated-recurrent-unit
Why does the GRU layer have 9,600 parameters?
Shouldn't it be ((16+32)*32 + 32) * 3 * 2 = 9,408?
Or, rearranged,
32*(16 + 32 + 1)*3*2 = 9,408
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=4500, output_dim=16, input_length=200),
    # Bidirectional wraps two GRUs (forward and backward), doubling the parameter count
    tf.keras.layers.Bidirectional(tf.keras.layers.GRU(32)),
    tf.keras.layers.Dense(6, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
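The key is that when reset_after=True, TensorFlow keeps separate biases for the input kernel and the recurrent kernel in GRUCell. You can see this in the following excerpt from the GRUCell source code: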
if self.use_bias:
    if not self.reset_after:
        bias_shape = (3 * self.units,)
    else:
        # separate biases for input and recurrent kernels
        # Note: the shape is intentionally different from CuDNNGRU biases
        # `(2 * 3 * self.units,)`, so that we can distinguish the classes
        # when loading and converting saved weights.
        bias_shape = (2, 3 * self.units)
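With reset_after=True, each gate therefore has two bias vectors: one added to the input projection and one added to the recurrent projection. Taking the reset gate as an example, the formula actually computed is:

$$r_t = \sigma(W_{ir} x_t + b_{ir} + W_{hr} h_{t-1} + b_{hr})$$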
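As you can see, GRU defaults to reset_after=True in TensorFlow 2, whereas it defaults to reset_after=False in TensorFlow 1.x.
So in TensorFlow 2 the number of parameters in this GRU layer is ((16+32)*32 + 32 + 32) * 3 * 2 = 9,600. The extra 32 is the second bias vector for each gate, and the final factor of 2 comes from the Bidirectional wrapper.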
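To check the arithmetic without building a model, here is a minimal sketch in plain Python that counts the weights from the shapes discussed above. The helper name `gru_param_count` and its arguments are hypothetical, made up for illustration; it is not part of TensorFlow. It assumes the layer holds an input kernel, a recurrent kernel, and a bias whose shape depends on reset_after.

```python
def gru_param_count(input_dim, units, reset_after=True, bidirectional=False):
    """Count GRU parameters from weight shapes (hypothetical helper, not a TensorFlow API)."""
    gates = 3  # update gate, reset gate, candidate state
    kernel = input_dim * units * gates        # input kernel: (input_dim, 3 * units)
    recurrent_kernel = units * units * gates  # recurrent kernel: (units, 3 * units)
    # reset_after=True -> bias shape (2, 3 * units); otherwise (3 * units,)
    bias = (2 if reset_after else 1) * units * gates
    total = kernel + recurrent_kernel + bias
    # A bidirectional wrapper holds two independent copies of the layer
    return 2 * total if bidirectional else total


tf2_default = gru_param_count(16, 32, reset_after=True, bidirectional=True)
tf1_default = gru_param_count(16, 32, reset_after=False, bidirectional=True)
print(tf2_default, tf1_default)  # 9600 9408
```

The second value matches the 9,408 expected in the question, so passing reset_after=False to tf.keras.layers.GRU in TensorFlow 2 should reproduce that count.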