Related questions and solutions (0)

Updating only part of the word embedding matrix in TensorFlow

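Suppose I want to update a pretrained word embedding matrix during training. Is there a way to update only a subset of the word embedding matrix?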

I looked at the TensorFlow API page and found this:

# Create an optimizer.
opt = GradientDescentOptimizer(learning_rate=0.1)

# Compute the gradients for a list of variables.
grads_and_vars = opt.compute_gradients(loss, <list of variables>)

# grads_and_vars is a list of tuples (gradient, variable).  Do whatever you
# need to the 'gradient' part, for example cap them, etc.
capped_grads_and_vars = [(MyCapper(gv[0]), gv[1]) for gv in grads_and_vars]

# Ask the optimizer to apply the capped gradients.
opt.apply_gradients(capped_grads_and_vars)

But how do I apply this to the word embedding matrix? Suppose I do this:

word_emb = tf.Variable(
    0.2 * tf.random_uniform([syn0.shape[0], s['es']], minval=-1.0, maxval=1.0, dtype=tf.float32),
    name='word_emb',
    trainable=False)

gather_emb = tf.gather(word_emb, indices)  # assuming that …
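One way to read the question is as a gradient-masking problem: between computing the gradients and applying them (the `MyCapper` step in the quoted pattern), zero out the rows that should stay frozen. The sketch below shows that idea in plain Python, without TensorFlow, so it runs anywhere. The function `masked_sgd_step` and its parameter names are hypothetical and chosen for illustration; this is not the accepted answer from the thread, and in TensorFlow the equivalent would presumably be multiplying the gradient by a 0/1 row mask.

```python
def masked_sgd_step(embeddings, grads, trainable_rows, learning_rate=0.1):
    """Apply one SGD step only to the rows listed in trainable_rows.

    embeddings, grads: lists of equal-length float lists (one list per word).
    This helper is hypothetical; it only illustrates the masking idea.
    """
    trainable = set(trainable_rows)
    updated = []
    for row_index, (row, grad_row) in enumerate(zip(embeddings, grads)):
        if row_index in trainable:
            # Trainable row: ordinary gradient descent update.
            updated.append([w - learning_rate * g for w, g in zip(row, grad_row)])
        else:
            # Frozen row: copied through unchanged, as if its gradient were zero.
            updated.append(list(row))
    return updated


# Three "word vectors" of size 2; only row 1 is allowed to change.
embeddings = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
grads = [[10.0, 10.0], [10.0, 10.0], [10.0, 10.0]]
new_embeddings = masked_sgd_step(embeddings, grads, trainable_rows=[1])
print(new_embeddings)
```

Rows 0 and 2 keep their pretrained values even though their gradients are nonzero, which is the behavior the question asks for.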

tensorflow word-embedding

Votes: 10
Answers: 2
Views: 3646

Slicing a variable returns a gradient of None

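I have been working with the tf.gradients() function and ran into behavior I did not expect: it seems unable to compute the gradient with respect to a slice of a variable. Here is an example that I hope shows what I mean: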

import tensorflow as tf

a = tf.Variable([1.0])
b = tf.Variable([1.0])
c = tf.concat(0, [a, b])  # pre-1.0 argument order; TensorFlow 1.x and later use tf.concat([a, b], 0)
print(c)  # >Tensor("concat:0", shape=(2,), dtype=float32)

grad_full = tf.gradients(c,  c)
grad_slice1 = tf.gradients(c,  a)
grad_slice2 = tf.gradients(c,  c[:, ])  # --> Here the gradient is None
grad_slice3 = tf.gradients(c,  c[0, ])  # --> Here the gradient is None

print(grad_full)  # >[<tf.Tensor 'gradients/Fill:0' shape=(2,) dtype=float32>]
print(grad_slice1)  # >[<tf.Tensor 'gradients_1/concat_grad/Slice:0' shape=(1,) dtype=float32>]
print(grad_slice2)  # >[None]
print(grad_slice3)  # >[None]

sess = tf.Session()
sess.run(tf.initialize_all_variables())

grad_full_v, grad_slice_v = …
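A likely explanation for the `None` results, offered as a reading of the example rather than an answer from this thread: `tf.gradients(ys, xs)` can only differentiate `ys` with respect to tensors that `ys` was computed from. Writing `c[0]` builds a new slice op downstream of `c`, so `c` does not depend on it. The sketch below models that dependency check with a toy graph in plain Python; the `Node` class and `depends_on` function are hypothetical stand-ins, not TensorFlow APIs.

```python
class Node:
    """Hypothetical stand-in for a graph tensor: a name plus its input nodes."""

    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = tuple(inputs)


def depends_on(y, x):
    """Return True if x is y itself or an ancestor of y in the graph."""
    stack, seen = [y], set()
    while stack:
        node = stack.pop()
        if node is x:
            return True
        if id(node) in seen:
            continue
        seen.add(id(node))
        stack.extend(node.inputs)
    return False


a = Node("a")
b = Node("b")
c = Node("concat", inputs=[a, b])
c_slice = Node("strided_slice", inputs=[c])  # plays the role of c[0]

# A gradient can exist only where the first node depends on the second.
reachable = {
    "c wrt c": depends_on(c, c),
    "c wrt a": depends_on(c, a),
    "c wrt c[0]": depends_on(c, c_slice),
    "c[0] wrt a": depends_on(c_slice, a),
}
print(reachable)
```

Under this reading, the slice belongs on the `ys` side (for example `tf.gradients(c[0], a)`), or the full gradient can be computed first and sliced afterwards.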

python deep-learning tensorflow

Votes: 6
Answers: 1
Views: 1913