MirroredStrategy without NCCL

Question

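Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10 x64
TensorFlow installed from (source or binary): binary
TensorFlow version (use command below): 1.8.0
Python version: 3.6
Bazel version (if compiling from source): -
GCC/Compiler version (if compiling from source): -
CUDA/cuDNN version: 9.0
GPU model and memory: 3.5
Exact command to reproduce: simple_tfkeras_example.py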

I want to use MirroredStrategy to train on multiple GPUs in the same machine. I tried one of the examples: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/distribute/python/examples/simple_tfkeras_example.py

The result is: ValueError: Op type not registered 'NcclAllReduce' in binary running on RAID. Make sure the Op and Kernel are registered in the binary running in this process. while building NodeDef 'NcclAllReduce'

I am on Windows, so NCCL is not available. Is it possible to force TensorFlow not to use this library?

Answer 1

Aus*_*tin 5

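There are some NCCL binaries for Windows, but they can be annoying to deal with.

As an alternative, TensorFlow gives you three other options in MirroredStrategy that are natively compatible with Windows: Hierarchical Copy, Reduce to First GPU, and Reduce to CPU. Hierarchical Copy is most likely what you are looking for, but you can test each of them to see which gives you the best result.

If you are using a TensorFlow version older than 2.0, you will use tf.contrib.distribute: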

import tensorflow as tf

# number_of_gpus is assumed to be defined elsewhere (the number of GPUs to mirror across).

# Hierarchical Copy
cross_tower_ops = tf.contrib.distribute.AllReduceCrossTowerOps(
    'hierarchical_copy', num_packs=number_of_gpus)
strategy = tf.contrib.distribute.MirroredStrategy(cross_tower_ops=cross_tower_ops)

# Reduce to First GPU
cross_tower_ops = tf.contrib.distribute.ReductionToOneDeviceCrossTowerOps()
strategy = tf.contrib.distribute.MirroredStrategy(cross_tower_ops=cross_tower_ops)

# Reduce to CPU
cross_tower_ops = tf.contrib.distribute.ReductionToOneDeviceCrossTowerOps(
    reduce_to_device="/device:CPU:0")
strategy = tf.contrib.distribute.MirroredStrategy(cross_tower_ops=cross_tower_ops)
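
The Hierarchical Copy snippet relies on a `number_of_gpus` value that is not defined in the answer. One possible way to obtain it on TensorFlow 1.x is sketched below. The helper names `count_gpus` and `local_gpu_count` are hypothetical, and `tensorflow.python.client.device_lib` is an internal module, so treat its availability on your version as an assumption.

```python
def count_gpus(devices):
    """Count the entries whose device_type is 'GPU'.

    `devices` is any iterable of objects with a `device_type` attribute,
    such as the list returned by device_lib.list_local_devices().
    """
    return sum(1 for d in devices if d.device_type == "GPU")


def local_gpu_count():
    """Return the number of GPUs TensorFlow can see on this machine."""
    # Imported lazily so count_gpus() stays usable without TensorFlow.
    from tensorflow.python.client import device_lib
    return count_gpus(device_lib.list_local_devices())
```

With this in place, `number_of_gpus = local_gpu_count()` could be set before building the strategy.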


From 2.0 onward, you only need to use tf.distribute. Here is an example that sets up an Xception model on 2 GPUs:

import tensorflow as tf
from tensorflow.keras.applications import Xception

# number_of_classes is assumed to be defined elsewhere.
strategy = tf.distribute.MirroredStrategy(
    devices=["/gpu:0", "/gpu:1"],
    cross_device_ops=tf.distribute.HierarchicalCopyAllReduce())
with strategy.scope():
    parallel_model = Xception(weights=None,
                              input_shape=(299, 299, 3),
                              classes=number_of_classes)
    parallel_model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
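
The other two NCCL-free options should also be reachable under `tf.distribute` in 2.x, through `tf.distribute.ReductionToOneDevice`. The following is a minimal sketch for switching between the three by name, which can make it easier to compare them. The option strings and the `make_strategy` helper are hypothetical names chosen for illustration, not part of TensorFlow.

```python
OPTIONS = ("hierarchical_copy", "reduce_to_first_gpu", "reduce_to_cpu")


def make_strategy(option="hierarchical_copy", devices=None):
    """Build a MirroredStrategy that does not depend on NCCL."""
    if option not in OPTIONS:
        raise ValueError(
            "unknown option %r; expected one of %s" % (option, ", ".join(OPTIONS)))

    # Imported lazily so the option check above runs without TensorFlow.
    import tensorflow as tf

    if option == "hierarchical_copy":
        ops = tf.distribute.HierarchicalCopyAllReduce()
    elif option == "reduce_to_first_gpu":
        # With no reduce_to_device, the reduction happens on the first
        # destination device, which is normally the first GPU listed.
        ops = tf.distribute.ReductionToOneDevice()
    else:
        ops = tf.distribute.ReductionToOneDevice(
            reduce_to_device="/device:CPU:0")

    return tf.distribute.MirroredStrategy(devices=devices,
                                          cross_device_ops=ops)
```

For example, `strategy = make_strategy("reduce_to_cpu", devices=["/gpu:0", "/gpu:1"])` could replace the strategy construction above, with the model still built inside `strategy.scope()`.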


Asked:	7 years, 4 months ago
Viewed:	1,962 times
Last active:	6 years, 2 months ago