When I run inference with a model on multiple GPUs (e.g. by calling model(inputs)) and compute its gradients, the machine uses only one GPU and leaves the rest idle.
For example, in the following code snippet:
import tensorflow as tf
import numpy as np
import os

os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

# Build the tf.data pipeline (parse_record is my own parsing function)
path_filename_records = 'your_path_to_records'
bs = 128

dataset = tf.data.TFRecordDataset(path_filename_records)
dataset = (dataset
           .map(parse_record, num_parallel_calls=tf.data.experimental.AUTOTUNE)
           .batch(bs)
           .prefetch(tf.data.experimental.AUTOTUNE))

# Load a model that was trained using MirroredStrategy
path_to_resnet = 'your_path_to_resnet'
mirrored_strategy = tf.distribute.MirroredStrategy()
with mirrored_strategy.scope():
    resnet50 = tf.keras.models.load_model(path_to_resnet)

for pre_images, true_label in dataset:
    with tf.GradientTape() as tape:
        tape.watch(pre_images)
        outputs = resnet50(pre_images)
    grads = tape.gradient(outputs, pre_images)
only one GPU is used. You can watch the GPUs' behavior with nvidia-smi. I don't know whether this is expected, model(inputs) …
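For comparison, the pattern I understand MirroredStrategy expects is to distribute the dataset with strategy.experimental_distribute_dataset and run the per-replica step through strategy.run, rather than iterating the plain dataset and calling the model directly. Here is a minimal sketch of that pattern using a tiny stand-in model (the Dense layer, batch sizes, and random data are placeholders, not my real ResNet setup); I am not certain this is the intended fix, but it is how the distribution guide structures custom loops:

```python
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()

# Placeholder model; in my case this would be the loaded ResNet50.
with strategy.scope():
    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])

# Placeholder data standing in for the parsed TFRecord pipeline.
dataset = tf.data.Dataset.from_tensor_slices(tf.random.normal([8, 4])).batch(4)
# Let the strategy split each batch across the available replicas.
dist_dataset = strategy.experimental_distribute_dataset(dataset)

@tf.function
def grad_step(images):
    # Gradient of the outputs w.r.t. the input images, as in my loop above.
    with tf.GradientTape() as tape:
        tape.watch(images)
        outputs = model(images)
    return tape.gradient(outputs, images)

for batch in dist_dataset:
    # strategy.run executes grad_step once per replica (i.e. per GPU).
    per_replica_grads = strategy.run(grad_step, args=(batch,))
```

With only one device visible, strategy.run degenerates to a single replica, so the shapes match the undistributed case; with two GPUs each replica should receive half of each batch.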