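I want to run multiple train_ops in parallel in a TensorFlow session. The answer here says that TensorFlow's sess.run() can release Python's GIL. I tried the example from that answer, but it seems the GIL is still in effect. I have 8 GPUs available. With num_threads set to 4, the run takes 24 seconds. With num_threads set to 8, it takes 54 seconds.
Here is the code: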
from threading import Thread
import tensorflow as tf
import time

num_threads = 8

a = []
for i in range(num_threads):
    with tf.device('/cpu:0'):
        a.append(tf.get_variable(name='a_%d'%i, shape=[5000, 50, 5, 5, 5, 5], initializer=tf.truncated_normal_initializer()))

b = []
for i in range(num_threads):
    with tf.device('/cpu:0'):
        b.append(tf.get_variable(name='b_%d'%i, shape=[5000, 50, 5, 5, 5, 5], initializer=tf.truncated_normal_initializer()))

train_ops = []
for i in range(num_threads):
    with tf.device('gpu:%d'%i):
        loss = tf.multiply(a[i], b[i], name='loss_%d'%i)
        train_ops.append(tf.train.GradientDescentOptimizer(0.01).minimize(loss))

sess = tf.Session()
sess.run(tf.initialize_all_variables())

def train_function(train_op):
    for i in range(20):
        sess.run(train_op)

train_threads = …
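
To separate the measurement method from TensorFlow itself, here is a minimal sketch of how threaded timing behaves in plain Python. It is not the code from the question: `timed_run`, `blocking_worker`, and `cpu_worker` are hypothetical names, and `time.sleep` is only a stand-in for a call that releases the GIL. If a call really releases the GIL, the wall-clock time for several threads should stay close to the time for one thread. A pure-Python loop holds the GIL, so its time should grow roughly with the thread count.

```python
import time
from threading import Thread


def timed_run(worker, num_threads):
    """Run worker() in num_threads threads and return elapsed wall-clock seconds."""
    threads = [Thread(target=worker) for _ in range(num_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


def blocking_worker():
    # Assumption: time.sleep stands in for a call that releases the GIL.
    time.sleep(0.2)


def cpu_worker():
    # A pure-Python loop; it holds the GIL while it runs.
    total = 0
    for i in range(500_000):
        total += i


if __name__ == "__main__":
    for n in (1, 4):
        print("threads=%d  blocking=%.2fs  cpu=%.2fs"
              % (n, timed_run(blocking_worker, n), timed_run(cpu_worker, n)))
```

If the same harness wrapped around sess.run(train_op) scales like `cpu_worker` rather than `blocking_worker`, that points to contention somewhere. It does not by itself show that the GIL is the cause, since placing every variable on '/cpu:0' could also serialize the work.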