nak*_*ung 0 distributed tensorflow
It seems that tf.train.replica_device_setter does not let you specify which GPU to use.
What I want to do is something like this:
with tf.device(
        tf.train.replica_device_setter(
            worker_device='/job:worker/task:%d/gpu:%d' % (deviceindex, gpuindex))):
    <build-some-tf-graph>
If your parameters are not sharded, you can do this with a simplified version of replica_device_setter, like the one below:
import tensorflow as tf

def assign_to_device(worker=0, gpu=0, ps_device="/job:ps/task:0/cpu:0"):
    def _assign(op):
        # tf.device may pass either an Operation or a bare NodeDef
        node_def = op if isinstance(op, tf.NodeDef) else op.node_def
        if node_def.op == "Variable":
            # variables live on the parameter server
            return ps_device
        else:
            # everything else goes on the chosen worker GPU
            return "/job:worker/task:%d/gpu:%d" % (worker, gpu)
    return _assign

with tf.device(assign_to_device(1, 2)):
    # this op goes on worker 1 gpu 2
    my_op = tf.ones(())
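
The same idea can be extended a little. The following is only a sketch, not the library's own implementation. It assumes that newer TensorFlow 1.x releases create variables with op types other than "Variable" (the names "VariableV2" and "VarHandleOp" here are an assumption you should check against your version), and it spreads variables round-robin over several ps tasks. make_device_chooser, num_ps and FakeNodeDef are hypothetical names introduced for illustration; FakeNodeDef only stands in for a NodeDef so the placement logic can be checked without a cluster.

```python
from collections import namedtuple

# Op types treated as parameters. "Variable" comes from the answer above;
# the other two names are an assumption about newer TensorFlow 1.x releases.
PS_OP_TYPES = ("Variable", "VariableV2", "VarHandleOp")


def make_device_chooser(worker=0, gpu=0, num_ps=1):
    """Return a device function: parameters rotate over ps tasks,
    every other op goes to one fixed worker GPU."""
    state = {"next_ps": 0}

    def _choose(op):
        # Accept either a NodeDef-like object or an op exposing .node_def
        node_def = getattr(op, "node_def", op)
        if node_def.op in PS_OP_TYPES:
            task = state["next_ps"]
            state["next_ps"] = (task + 1) % num_ps
            return "/job:ps/task:%d/cpu:0" % task
        return "/job:worker/task:%d/gpu:%d" % (worker, gpu)

    return _choose


# Stand-in for a NodeDef so the sketch runs without TensorFlow installed.
FakeNodeDef = namedtuple("FakeNodeDef", ["op"])

chooser = make_device_chooser(worker=1, gpu=2, num_ps=2)
op_types = ("Variable", "Const", "VariableV2", "VarHandleOp")
placements = [chooser(FakeNodeDef(t)) for t in op_types]
print(placements)
```

In a real graph the returned function would be passed to tf.device in the same way as assign_to_device above, for example with tf.device(make_device_chooser(1, 2, num_ps=2)).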