Tensorflow: How to write an op with a gradient in Python?

Question

Ale*_*x I 10 python neural-network gradient-descent tensorflow

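I want to write a TensorFlow op in Python, but I want it to be differentiable (so that a gradient can be computed).

This question asks how to write an op in Python, and the answer suggests using py_func (which has no gradient): Tensorflow: Writing an Op in Python

The TF documentation describes how to add an op only from C++ code: https://www.tensorflow.org/versions/r0.10/how_tos/adding_an_op/index.html

In my case I am prototyping, so I don't care whether it runs on the GPU, and I don't care whether it is usable from anything other than the TF Python API.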

Answer 1

pat*_*_ai 12

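Yes, as mentioned in @Yaroslav's answer, it is possible, and the key is the links he references: here and here. I want to elaborate on that answer by giving a concrete example.

Modulo operation: Let's implement the element-wise modulo operation in TensorFlow (it already exists, but its gradient is not defined; for the sake of the example we will implement it from scratch).

Numpy function: The first step is to define the operation we want for numpy arrays. The element-wise modulo operation is already implemented in numpy, so this is easy: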

import numpy as np
def np_mod(x,y):
    return (x % y).astype(np.float32)

The reason for .astype(np.float32) is that TensorFlow uses float32 by default, and it will complain if you give it float64 (the numpy default).
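To make the dtype point concrete, here is a small sketch that needs only numpy. It reuses np_mod from above; the sample arrays are just illustrative inputs.

```python
import numpy as np

def np_mod(x, y):
    # Element-wise remainder, cast to float32 to match TensorFlow's default float type.
    return (x % y).astype(np.float32)

# Arrays built from Python floats are float64 unless a dtype is given.
x = np.array([0.3, 0.7, 1.2, 1.7])
y = np.array([0.2, 0.5, 1.0, 2.9])

raw = x % y          # stays float64
cast = np_mod(x, y)  # float32 after the explicit cast

print(raw.dtype)   # float64
print(cast.dtype)  # float32
```

Without the cast, the array handed back to TensorFlow would be float64 while the op was declared to return tf.float32.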

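Gradient function: Next we need to define the gradient of our operation with respect to each of its inputs, written as a TensorFlow function. The function has to take a very specific form. It takes the TensorFlow representation of the operation, op, and the gradient of the output, grad, and says how to propagate the gradients. In our case the gradient of the mod operation is easy: the derivative is 1 with respect to the first argument and -floor(x/y) with respect to the second (almost everywhere; it is infinite at a countable set of points, but let's ignore that, see https://math.stackexchange.com/questions/1849280/derivative-of-remainder-function-wrt-denominator for details). So we have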

def modgrad(op, grad):
    x = op.inputs[0]  # first argument (inputs are usually needed for the gradient, e.g. the gradient of x^2 is 2x)
    y = op.inputs[1]  # second argument

    # Propagated gradients with respect to the first and second argument, respectively.
    # Note: tf.neg was renamed to tf.negative in TensorFlow 1.0.
    return grad * 1, grad * tf.neg(tf.floordiv(x, y))

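The grad function needs to return an n-tuple, where n is the number of arguments of the operation. Note that the returned values must be TensorFlow functions of the inputs.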
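The derivatives used in modgrad follow from writing x mod y = x - y * floor(x / y): away from the jumps of floor(x / y) the floor term is locally constant, so the derivative is 1 with respect to x and -floor(x / y) with respect to y. As a sanity check, here is a plain-Python finite-difference sketch. The helper names and the sample point are made up for illustration and are not part of TensorFlow.

```python
import math

def mod(x, y):
    # Floored remainder, the same convention as Python's % on floats.
    return x - y * math.floor(x / y)

def analytic_grads(x, y):
    # Derivatives used in modgrad: 1 w.r.t. x and -floor(x / y) w.r.t. y.
    return 1.0, -math.floor(x / y)

def numeric_grads(x, y, eps=1e-6):
    # Central finite differences; only meaningful away from the jumps of floor(x / y).
    dx = (mod(x + eps, y) - mod(x - eps, y)) / (2 * eps)
    dy = (mod(x, y + eps) - mod(x, y - eps)) / (2 * eps)
    return dx, dy

# Arbitrary sample point: x0 / y0 = 3.65 is well away from an integer.
x0, y0 = 7.3, 2.0
print(analytic_grads(x0, y0))
print(numeric_grads(x0, y0))
```

Both calls should agree to within the finite-difference error, with a derivative of about 1 in x and about -3 in y at this point.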

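Making a TF function with gradients: As explained in the sources mentioned above, there is a hack for defining the gradient of a function using tf.RegisterGradient [doc] and tf.Graph.gradient_override_map [doc].

Copying the code from harpone, we can modify the tf.py_func function so that it defines the gradient at the same time: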

import tensorflow as tf

def py_func(func, inp, Tout, stateful=True, name=None, grad=None):

    # Need to generate a unique name to avoid duplicates:
    rnd_name = 'PyFuncGrad' + str(np.random.randint(0, 1E+8))

    tf.RegisterGradient(rnd_name)(grad)  # see _MySquareGrad for grad example
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": rnd_name}):
        return tf.py_func(func, inp, Tout, stateful=stateful, name=name)

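The stateful option tells TensorFlow whether the function always gives the same output for the same input (stateful=False), in which case TensorFlow can simplify the graph. This is our case, and it will probably be the case in most situations.

Combining it all together: Now that we have all the pieces, we can combine them: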
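The register-then-override trick in py_func can be pictured with a toy model in plain Python. This is a hypothetical sketch of the idea as I understand it, not TensorFlow's implementation: every name below (register_gradient, ToyOp, toy_py_func, backprop) is invented for illustration, and a counter stands in for the random suffix.

```python
import contextlib
import itertools

# Hypothetical toy model of a gradient registry; none of this is TensorFlow code.
_registry = {}                   # gradient name -> gradient function
_overrides = {}                  # op type -> gradient name, active inside the context
_unique_ids = itertools.count()  # stands in for the random suffix in py_func

def register_gradient(name, grad_fn):
    if name in _registry:
        raise KeyError("gradient already registered: " + name)
    _registry[name] = grad_fn

@contextlib.contextmanager
def gradient_override_map(mapping):
    saved = dict(_overrides)
    _overrides.update(mapping)
    try:
        yield
    finally:
        _overrides.clear()
        _overrides.update(saved)

class ToyOp:
    def __init__(self, op_type, inputs):
        self.type = op_type
        self.inputs = inputs
        # The gradient name is looked up once, when the op is created.
        self.grad_name = _overrides.get(op_type, op_type)

def toy_py_func(inputs, grad_fn):
    name = "PyFuncGrad%d" % next(_unique_ids)  # unique name per call
    register_gradient(name, grad_fn)
    with gradient_override_map({"PyFunc": name}):
        return ToyOp("PyFunc", inputs)

def backprop(op, upstream):
    return _registry[op.grad_name](op, upstream)

def mod_grad(op, grad):
    x, y = op.inputs
    return grad * 1, grad * -(x // y)

op = toy_py_func([7.3, 2.0], mod_grad)
print(op.grad_name)       # PyFuncGrad0
print(backprop(op, 1.0))  # (1.0, -3.0)
```

The unique name matters because every wrapped function shares the op type "PyFunc"; giving each call its own gradient name lets different Python functions carry different gradients.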

from tensorflow.python.framework import ops

def tf_mod(x,y, name=None):

    with ops.op_scope([x,y], name, "mod") as name:
        z = py_func(np_mod,
                        [x,y],
                        [tf.float32],
                        name=name,
                        grad=modgrad)  # <-- here's the call to the gradient
        return z[0]

tf.py_func acts on lists of tensors (and returns a list of tensors), which is why we pass [x,y] (and return z[0]). And now we are done. We can test it.

Test:

with tf.Session() as sess:

    x = tf.constant([0.3,0.7,1.2,1.7])
    y = tf.constant([0.2,0.5,1.0,2.9])
    z = tf_mod(x,y)
    gr = tf.gradients(z, [x,y])
    tf.initialize_all_variables().run()

    print(x.eval(), y.eval(),z.eval(), gr[0].eval(), gr[1].eval())

[0.30000001 0.69999999 1.20000005 1.70000005] [0.2 0.5 1. 2.9000001] [0.10000001 0.19999999 0.20000005 1.70000005] [1. 1. 1. 1.] [-1. -1. -1. 0.]

Success!
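The printed values can be cross-checked without TensorFlow. The sketch below recomputes the output and both gradients in plain Python for the same inputs, assuming an upstream gradient of ones (which is what tf.gradients uses by default). It works in float64, so the last digits differ slightly from the float32 printout.

```python
import math

xs = [0.3, 0.7, 1.2, 1.7]
ys = [0.2, 0.5, 1.0, 2.9]

z = [x % y for x, y in zip(xs, ys)]
grad_x = [1.0 for _ in xs]                                   # d(x mod y)/dx = 1
grad_y = [0.0 - math.floor(x / y) for x, y in zip(xs, ys)]   # d(x mod y)/dy = -floor(x / y)

print([round(v, 6) for v in z])  # [0.1, 0.2, 0.2, 1.7]
print(grad_x)                    # [1.0, 1.0, 1.0, 1.0]
print(grad_y)                    # [-1.0, -1.0, -1.0, 0.0]
```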

Archived:	9 years, 3 months ago
Viewed:	5089 times
Last activity:	7 years, 5 months ago