Ela*_*iss 6 python redhat python-2.7 tensorflow
I am trying to learn distributed TensorFlow. I tried the code snippet explained here:
with tf.device("/cpu:0"):
    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))

with tf.device("/cpu:1"):
    y = tf.nn.softmax(tf.matmul(x, W) + b)
    loss = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
I get the following error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation 'MatMul': Operation was explicitly assigned to /device:CPU:1 but available devices are [ /job:localhost/replica:0/task:0/cpu:0 ]. Make sure the device specification refers to a valid device.
[[Node: MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/device:CPU:1"](Placeholder, Variable/read)]]
This means TensorFlow does not recognize CPU:1.
I am running on a RedHat server with 40 CPUs (cat /proc/cpuinfo | grep processor | wc -l).
Any ideas?
Following the link in the comments:
It turns out the session has to be configured with a device count greater than 1:
config = tf.ConfigProto(device_count={"CPU": 8})
with tf.Session(config=config) as sess:
    ...
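For anyone who wants to try this end to end, here is a minimal sketch that puts the question's graph and the session config together. It rests on a few assumptions that are not in the original post: it imports `tensorflow.compat.v1` (available in TensorFlow 1.14 or later; on older 1.x releases a plain `import tensorflow as tf` as in the question should play the same role), it defines `x` and `y_` as placeholders with the MNIST shapes implied by `W` and `b`, it writes `axis=1` in place of `reduction_indices=[1]`, and it wraps everything in a hypothetical helper named `run_two_cpu_graph`.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TensorFlow >= 1.14

tf.disable_v2_behavior()  # keep graph/session semantics when running on 2.x


def run_two_cpu_graph(num_cpu_devices=2):
    """Hypothetical helper: build the question's graph and evaluate the loss once."""
    graph = tf.Graph()
    with graph.as_default():
        # Assumed placeholders; the question does not show how x and y_ are defined.
        x = tf.placeholder(tf.float32, [None, 784])
        y_ = tf.placeholder(tf.float32, [None, 10])

        # Variables are pinned to the first logical CPU device.
        with tf.device("/cpu:0"):
            W = tf.Variable(tf.zeros([784, 10]))
            b = tf.Variable(tf.zeros([10]))

        # The model and loss are pinned to the second logical CPU device.
        with tf.device("/cpu:1"):
            y = tf.nn.softmax(tf.matmul(x, W) + b)
            loss = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), axis=1))

        init = tf.global_variables_initializer()

    # Ask the session to expose more than one logical CPU device.
    config = tf.ConfigProto(device_count={"CPU": num_cpu_devices})
    with tf.Session(graph=graph, config=config) as sess:
        sess.run(init)
        batch_x = np.zeros((4, 784), dtype=np.float32)  # dummy inputs
        batch_y = np.eye(10, dtype=np.float32)[:4]      # dummy one-hot labels
        loss_value = sess.run(loss, feed_dict={x: batch_x, y_: batch_y})
    return float(loss_value), y.device


if __name__ == "__main__":
    loss_value, device = run_two_cpu_graph()
    print("loss:", loss_value, "device of y:", device)
```

Calling `run_two_cpu_graph(num_cpu_devices=1)` should reproduce the `InvalidArgumentError` from the question, since only `cpu:0` exists in that case. The number of physical cores on the machine does not matter here: by default the session exposes a single logical CPU device, and `device_count` is what changes that.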
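It is a little shocking that I missed something so basic, and that nobody could point out an error that seems so obvious.
I am not sure whether the problem is me or the TensorFlow code samples and documentation. Since it's Google, I'll have to say it's me.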