@tf.function ValueError: Creating variables on a non-first call to a function decorated with tf.function, unable to understand behaviour

Question


dro*_*ngo 5 python keras tensorflow tensorflow2.0

I would like to know why this function:

@tf.function
def train(self,TargetNet,epsilon):
    if len(self.experience['s']) < self.min_experiences:
        return 0
    ids=np.random.randint(low=0,high=len(self.replay_buffer['s']),size=self.batch_size)
    states=np.asarray([self.experience['s'][i] for i in ids])
    actions=np.asarray([self.experience['a'][i] for i in ids])
    rewards=np.asarray([self.experience['r'][i] for i in ids])
    next_states=np.asarray([self.experience['s1'][i] for i in ids])
    dones = np.asarray([self.experience['done'][i] for i in ids])
    q_next_actions=self.get_action(next_states,epsilon)
    q_value_next=TargetNet.predict(next_states)
    q_value_next=tf.gather_nd(q_value_next,tf.stack((tf.range(self.batch_size),q_next_actions),axis=1))
    targets=tf.where(dones, rewards, rewards+self.gamma*q_value_next)

    with tf.GradientTape() as tape:
        estimates=tf.math.reduce_sum(self.predict(states)*tf.one_hot(actions,self.num_actions),axis=1)
        loss=tf.math.reduce_sum(tf.square(estimates - targets))
    variables=self.model.trainable_variables
    gradients=tape.gradient(loss,variables)
    self.optimizer.apply_gradients(zip(gradients,variables))


raises ValueError: Creating variables on a non-first call to a function decorated with tf.function, whereas this very similar code:

@tf.function
def train(self, TargetNet):
    if len(self.experience['s']) < self.min_experiences:
        return 0
    ids = np.random.randint(low=0, high=len(self.experience['s']), size=self.batch_size)
    states = np.asarray([self.experience['s'][i] for i in ids])
    actions = np.asarray([self.experience['a'][i] for i in ids])
    rewards = np.asarray([self.experience['r'][i] for i in ids])
    states_next = np.asarray([self.experience['s2'][i] for i in ids])
    dones = np.asarray([self.experience['done'][i] for i in ids])
    value_next = np.max(TargetNet.predict(states_next), axis=1)
    actual_values = np.where(dones, rewards, rewards+self.gamma*value_next)

    with tf.GradientTape() as tape:
        selected_action_values = tf.math.reduce_sum(
            self.predict(states) * tf.one_hot(actions, self.num_actions), axis=1)
        loss = tf.math.reduce_sum(tf.square(actual_values - selected_action_values))
    variables = self.model.trainable_variables
    gradients = tape.gradient(loss, variables)
    self.optimizer.apply_gradients(zip(gradients, variables))


does not throw an error. Please help me understand why.

EDIT: I removed the parameter epsilon from the function and it works. Is it because the @tf.function decorator is only valid for single-argument functions?

Answer 1

nes*_*uno 8

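Using tf.function you are converting the content of the decorated function: this means that TensorFlow will try to compile your eager code into its graph representation.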
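To make the conversion step concrete, here is a minimal sketch (the function `double` and the counter `trace_count` are hypothetical names, not from the question). It assumes TensorFlow 2.x with default settings and shows that ordinary Python statements in the function body run only while the graph is being traced, not on later calls that reuse the graph.

```python
import tensorflow as tf

trace_count = 0  # hypothetical counter, touched only by plain Python code


@tf.function
def double(x):
    global trace_count
    trace_count += 1  # Python side effect: executed only while tracing
    return x * 2      # TensorFlow op: recorded in the graph, run on every call


a = double(tf.constant(1.0))
b = double(tf.constant(2.0))  # same dtype and shape, so the traced graph is reused

print(trace_count, float(a), float(b))
```

The second call still computes a fresh result, but the Python body is not executed again, which is why state created in Python inside the function needs special care.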

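Variables, however, are special objects. In fact, when you were using TensorFlow 1.x (graph mode), you defined the variables only once and then used/updated them.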

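In TensorFlow 2.0, if you use pure eager execution, you can declare and re-use the same variable more than once, since a tf.Variable in eager mode is just a plain Python object: it is destroyed as soon as the function ends and the variable goes out of scope.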

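For TensorFlow to be able to correctly convert a function that creates state (that is, one that uses variables), you have to break the function scope and declare the variables outside of the function.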
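As a minimal sketch of the failure (the function name `broken` is hypothetical, and this assumes TensorFlow 2.x defaults), a decorated function that unconditionally creates a tf.Variable in its body is rejected with a ValueError. The exact wording of the message may differ between TensorFlow versions.

```python
import tensorflow as tf


@tf.function
def broken():
    # A new variable would be created each time the function is traced,
    # which tf.function refuses to allow.
    v = tf.Variable(1.0)
    return v + 1.0


try:
    broken()
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)
```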

In short, if you have a function that works correctly in eager mode, like:

import tensorflow as tf

def f():
    a = tf.constant([[10,10],[11.,1.]])
    x = tf.constant([[1.,0.],[0.,1.]])
    b = tf.Variable(12.)  # a new variable is created on every call
    y = tf.matmul(a, x) + b
    return y


You have to change its structure to something like:

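b = None  # module-level handle, so the variable outlives a single call

@tf.function
def f():
    a = tf.constant([[10, 10], [11., 1.]])
    x = tf.constant([[1., 0.], [0., 1.]])
    global b
    if b is None:
        # Create the variable only once, during the first trace
        b = tf.Variable(12.)
    y = tf.matmul(a, x) + b
    print("PRINT: ", y)        # Python print: runs only while tracing
    tf.print("TF-PRINT: ", y)  # graph op: runs on every call
    return y

f()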


With this structure, the function works correctly with the tf.function decorator.
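An alternative to a global is to keep the variable on an object and create it in the constructor. The following is only a sketch under that assumption (the class name `AddBias` is hypothetical, and it uses tf.Module for variable tracking). It also shows, incidentally, that a tf.function-decorated method can take more than one argument.

```python
import tensorflow as tf


class AddBias(tf.Module):  # hypothetical class, for illustration only
    def __init__(self):
        super().__init__()
        # The variable is created once, outside the traced function.
        self.b = tf.Variable(12.0)

    @tf.function
    def __call__(self, a, x):
        # Only uses the existing variable; nothing is created during tracing.
        return tf.matmul(a, x) + self.b


layer = AddBias()
a = tf.constant([[10.0, 10.0], [11.0, 1.0]])
x = tf.constant([[1.0, 0.0], [0.0, 1.0]])

y1 = layer(a, x)     # uses b = 12.0
layer.b.assign(0.0)  # update the state in place
y2 = layer(a, x)     # the same graph now sees b = 0.0

print(y1.numpy())
print(y2.numpy())
```

Because the graph reads the variable rather than a frozen copy of its value, updating `b` between calls changes the result without retracing.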

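I covered this (and other) scenarios in a series of blog posts: part 1 analyzes this behavior in the section "Handling states breaking the function scope" (though I suggest reading it from the beginning, along with parts 2 and 3).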
