Why is ThreadPoolExecutor slower than a for loop?

Question

Har*_*nan 3 python numpy python-multithreading concurrent.futures

Code 1

def feedforward(self,d):
    out = []
    for neuron in self.layer:
        out.append(neuron.feedforward(d))
    return np.array(out)

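This is the original code I wrote to perform the feedforward pass. I wanted to speed it up with multithreading, so I changed it to use ThreadPoolExecutor from the concurrent.futures module.

Code 2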

def parallel_feedforward(self,func,param):
    return func.feedforward(param)

def feedforward(self,d):
    out = []
    with ThreadPoolExecutor(max_workers = 4) as executor:
        new_d = np.tile(d,(len(self.layer),1))
        for o in executor.map(self.parallel_feedforward,self.layer,new_d):
            out.append(o)
    return np.array(out)

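The variable d is a vector; I use np.tile() on it so that executor.map receives one copy of the input per neuron.

After timing both versions, I found that Code 1 is significantly faster than Code 2 (Code 2 is almost 8-10 times slower). But shouldn't the multithreaded code be faster than the loop? Is it because my code is wrong, or is there some other reason? If there is a mistake in my code, can someone tell me what I did wrong?

Thanks in advance for your help.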

Answer 1

Ric*_*ard 6

Hari,

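You should do a quick search on Python and threads. In particular, Python threads do not run Python code in parallel because of the GIL (global interpreter lock). So if the function above is CPU-bound, it will not actually run any faster with Python threads used as above.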
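
To see the effect for yourself, here is a minimal sketch that runs the same CPU-bound work once in a plain loop and once through a ThreadPoolExecutor. The function cpu_bound and the job sizes are made up for illustration and are not your neuron code. On a standard CPython build with the GIL, the threaded version is usually no faster, and often a little slower.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def cpu_bound(n):
    """Hypothetical CPU-bound task: a pure-Python loop that holds the GIL."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_sequential(jobs):
    """Run every job one after another in the calling thread."""
    return [cpu_bound(n) for n in jobs]


def run_threaded(jobs, workers=4):
    """Run the same jobs through a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(cpu_bound, jobs))


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    jobs = [200_000] * 8
    seq_result, seq_time = timed(run_sequential, jobs)
    thr_result, thr_time = timed(run_threaded, jobs)
    print(f"sequential: {seq_time:.3f}s")
    print(f"threaded:   {thr_time:.3f}s")
    print("same results:", seq_result == thr_result)
```

The exact timings depend on your machine and Python version, so treat the numbers as a rough indication only.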

To truly run in parallel, you need to use ProcessPoolExecutor, which sidesteps the GIL that threads are subject to.
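
A rough sketch of what that could look like is below. Since I can't see your neuron class, neuron_feedforward is a hypothetical stand-in (a sigmoid of a weighted sum), and the layer is assumed to be a list of (weights, bias) pairs. The worker function has to live at module level so it can be pickled, and the pool has to be started under the `if __name__ == "__main__":` guard.

```python
import math
from concurrent.futures import ProcessPoolExecutor


def neuron_feedforward(args):
    """Hypothetical stand-in for neuron.feedforward: sigmoid(w . d + b)."""
    weights, bias, d = args
    z = sum(w * x for w, x in zip(weights, d)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def layer_feedforward(layer, d, workers=4):
    """Evaluate every (weights, bias) pair in `layer` on input `d`
    using worker processes. Each task is pickled on its way to a worker."""
    tasks = [(weights, bias, d) for weights, bias in layer]
    with ProcessPoolExecutor(max_workers=workers) as executor:
        # chunksize sends several tasks per inter-process message,
        # which cuts down on per-task overhead.
        return list(executor.map(neuron_feedforward, tasks, chunksize=16))


if __name__ == "__main__":
    # Made-up layer of 64 neurons with 3 inputs each.
    layer = [([0.1 * i, -0.2, 0.3], 0.05) for i in range(64)]
    d = [1.0, 2.0, 3.0]
    out = layer_feedforward(layer, d)
    print(len(out), out[:3])
```

For work as small as one dot product per neuron, starting processes and shipping arguments will probably still cost more than the computation itself, so this only pays off when each task is reasonably heavy.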

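As for why it might be running 8-10 times slower, one thought: when you submit a call with arguments to an executor, those arguments have to be handed to the workers. With a ProcessPoolExecutor they are pickled, sent to the worker process, and unpickled there before use (if this is new to you, do a quick search on Python pickling). A ThreadPoolExecutor shares memory with the caller and does not pickle anything, but it still pays for creating the pool on every call, queueing each task, and threads contending for the GIL.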

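If your arguments are large, this can take a significant amount of time.

So that may be the reason you are seeing the slowdown. I have seen big slowdowns in my own code because I was trying to pass large arguments to the workers.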
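
If you want a feel for how much passing large arguments can cost, here is a small sketch that times a pickle round trip for lists of growing size. This mimics what happens to each argument sent to a process pool (threads share memory and skip this step). The helper pickle_round_trip and the sizes are just illustrative choices.

```python
import pickle
import time


def pickle_round_trip(obj):
    """Serialize and deserialize `obj`, roughly what a process pool does
    to each argument. Returns (restored object, payload bytes, seconds)."""
    start = time.perf_counter()
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    restored = pickle.loads(payload)
    elapsed = time.perf_counter() - start
    return restored, len(payload), elapsed


if __name__ == "__main__":
    for size in (1_000, 100_000, 1_000_000):
        data = [float(i) for i in range(size)]
        _, nbytes, elapsed = pickle_round_trip(data)
        print(f"{size:>9} floats: {nbytes:>9} bytes, {elapsed * 1000:.2f} ms")
```

If this cost is paid once per task, and each task only does a small amount of work, the copying can easily dominate the total run time.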
