How to convert PerReplica to a tensor?

Question

yan*_*ang 5 python tensorflow tensorflow2.0

When training on multiple GPUs in TensorFlow 2.0, the per-replica losses are reduced with the following code:

strategy.reduce(tf.distribute.ReduceOp.SUM, per_replica_losses, axis=None)

However, what if I only want to gather the predictions from all GPUs into a tensor, without a sum or mean reduction?

per_replica_losses, per_replica_predicitions = strategy.experimental_run_v2(train_step, args=(dataset_inputs,))
# how to convert per_replica_predicitions to a tensor ?

Answer 1

小智 8

In short, you can convert the PerReplica result into a tuple of tensors like this:

tensors_tuple = per_replica_predicitions.values

The returned tensors_tuple is a tuple of the predictions from each replica/device:

(prediction_tensor_from_dev0, prediction_tensor_from_dev1, ...)

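The number of elements in this tuple is determined by the devices available to the distribution strategy. In particular, if the strategy runs on a single replica/device, the return value of strategy.experimental_run_v2 is the same as what you get from calling train_step directly (a tensor or a list of tensors, depending on what your train_step returns). So you may want to write code like this: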

per_replica_losses, per_replica_predicitions = strategy.experimental_run_v2(train_step, args=(dataset_inputs,))

if strategy.num_replicas_in_sync > 1:
    predicition_tensors = per_replica_predicitions.values
else:
    predicition_tensors = per_replica_predicitions

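PerReplica is a class that wraps the results of a distributed run. You can find its definition here; it has more properties and methods for working with PerReplica objects.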
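To get from the tuple to one combined result, the per-replica pieces still have to be joined. Here is a minimal sketch of the single-replica/multi-replica branch followed by that join. It runs without TensorFlow: FakePerReplica and as_tuple are hypothetical names made up for illustration, and plain Python lists stand in for tensors.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass
class FakePerReplica:
    """Hypothetical stand-in for PerReplica; it only mimics the `.values` attribute."""
    values: Tuple[Any, ...]


def as_tuple(result: Any, num_replicas_in_sync: int) -> Tuple[Any, ...]:
    """Return the per-replica results as a tuple in both cases."""
    if num_replicas_in_sync > 1:
        # Multi-replica: unwrap the container into one entry per device.
        return tuple(result.values)
    # Single replica: the bare train_step result is wrapped in a 1-tuple.
    return (result,)


# Two replicas: the strategy hands back a PerReplica-like container.
multi = as_tuple(FakePerReplica(values=([1, 2], [3, 4])), num_replicas_in_sync=2)

# One replica: the strategy hands back whatever train_step returned.
single = as_tuple([1, 2], num_replicas_in_sync=1)

# Join the per-replica batches in replica order (lists stand in for tensors).
merged = [x for part in multi for x in part]
print(multi, single, merged)
```

With real tensors, the last step would be tf.concat(multi, axis=0), which joins the pieces along the batch axis. As far as I know, strategy.experimental_local_results(per_replica_predicitions) also returns such a tuple in both the single- and multi-replica case, so it may make the if/else unnecessary. Note that experimental_run_v2 was renamed to strategy.run in TensorFlow 2.2.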
