Caching a computed value as a constant in TensorFlow

Asked by use*_*768 · 6 · tags: python, numpy, constants, linear-regression, tensorflow

Suppose I want to compute the least squares coefficients in TensorFlow using the closed-form solution. Normally, I would do this,

beta_hat = tf.matmul(
    tf.matmul(tf.matrix_inverse(tf.matmul(tf.transpose(X), X)),
              tf.transpose(X)), y
)

where X and y are TensorFlow placeholders corresponding to the covariates and the target variable, respectively.
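
For reference, this is just the closed-form (normal equations) estimator,

\hat{\beta} = (X^\top X)^{-1} X^\top y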

If I then wanted to make predictions, I would do something like,

y_pred = tf.matmul(X, beta_hat)

Now, if I were to execute,

sess.run(y_pred, feed_dict={X: data_X})

I would of course get an error, since I have not supplied the required value for the placeholder y. I would like the flexibility to treat beta_hat as a constant once it has been computed (so that I do not need to define a new placeholder for a new covariate matrix in order to make predictions). One way to achieve this is,

# Make it constant.
beta_hat = sess.run(beta_hat, feed_dict={X: data_X, y: data_y})
y_pred = tf.matmul(X, beta_hat)

I was wondering whether there is a more elegant way to treat the tensor as a constant, so that I neither need to run the session and fetch the result, nor create a separate placeholder for the prediction input data.
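
A minor variant of the workaround above, assuming the value has already been fetched, is to wrap the resulting NumPy array in an explicit tf.constant node; this makes the cached value a proper graph constant, but it still requires the initial session run:

# Sketch of a variant: wrap the fetched value in an explicit constant node.
# `beta_hat_value` is a hypothetical name for the fetched NumPy array.
beta_hat_value = sess.run(beta_hat, feed_dict={X: data_X, y: data_y})
beta_hat_const = tf.constant(beta_hat_value)
y_pred = tf.matmul(X, beta_hat_const)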

Here is some sample code that demonstrates the situation I am describing.

import numpy as np
import tensorflow as tf


n, k = 100, 5
X = tf.placeholder(dtype=tf.float32, shape=[None, k])
y = tf.placeholder(dtype=tf.float32, shape=[None, 1])

beta = np.random.normal(size=(k, ))
data_X = np.random.normal(size=(n, k))

data_y = data_X.dot(beta)
data_y += np.random.normal(size=data_y.shape) / 3.0
data_y = np.atleast_2d(data_y).T

# Convert to 32-bit precision.
data_X, data_y = np.float32(data_X), np.float32(data_y)

# Compute the least squares solution.
beta_hat = tf.matmul(
    tf.matmul(tf.matrix_inverse(tf.matmul(tf.transpose(X), X)),
              tf.transpose(X)), y
)

# Launch the graph
sess = tf.Session()
sess.run(tf.initialize_all_variables())

print "True beta: {}".format(beta)
print "Est. beta: {}".format(
    sess.run(beta_hat, feed_dict={X: data_X, y: data_y}).ravel()
)

# # This would error.
# y_pred = tf.matmul(X, beta_hat)
# print "Predictions:"
# print sess.run(y_pred, feed_dict={X: data_X})

# Make it constant.
beta_hat = sess.run(beta_hat, feed_dict={X: data_X, y: data_y})

# This will no longer error.
y_pred = tf.matmul(X, beta_hat)
print "Predictions:"
print sess.run(y_pred, feed_dict={X: data_X})

Answered by mrr*_*rry · 2

Perhaps counterintuitively, the simplest way to re-use beta_hat as a constant in subsequent steps is to assign it to a tf.Variable:

import numpy as np
import tensorflow as tf


n, k = 100, 5
X = tf.placeholder(dtype=tf.float32, shape=[None, k])
y = tf.placeholder(dtype=tf.float32, shape=[None, 1])

beta = np.random.normal(size=(k, ))
data_X = np.random.normal(size=(n, k))

data_y = data_X.dot(beta)
data_y += np.random.normal(size=data_y.shape) / 3.0
data_y = np.atleast_2d(data_y).T

# Convert to 32-bit precision.
data_X, data_y = np.float32(data_X), np.float32(data_y)

# Compute the least squares solution.
beta_hat = tf.matmul(
    tf.matmul(tf.matrix_inverse(tf.matmul(tf.transpose(X), X)),
              tf.transpose(X)), y
)

# Create a variable whose initializer is the `beta_hat` computation.
beta_hat_cached = tf.Variable(beta_hat)

# Launch the graph
sess = tf.Session()

print "True beta: {}".format(beta)
# Run the initializer, which computes `beta_hat` once:
sess.run(beta_hat_cached.initializer, feed_dict={X: data_X, y: data_y})
# To access the value of `beta_hat`, "run" the variable to read its contents.
print "Est. beta: {}".format(beta_hat_cached.ravel())

# Use the cached version to compute predictions.
y_pred = tf.matmul(X, beta_hat_cached)
print "Predictions:"
print sess.run(y_pred, feed_dict={X: data_X})
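
Running the initializer evaluates beta_hat exactly once (with the training feed) and stores the result in the session's variable state, so subsequent reads never touch the y placeholder. A minimal usage sketch, where new_data_X is a hypothetical batch of fresh covariates:

# Hypothetical fresh covariates: only `X` needs to be fed now, because
# `beta_hat_cached` is read from the variable rather than recomputed.
new_data_X = np.float32(np.random.normal(size=(10, k)))
print "Predictions on new data:"
print sess.run(y_pred, feed_dict={X: new_data_X})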