Python multiprocessing is slower than single-threaded

40P*_*lot 2 python multithreading

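I have been experimenting with a multiprocessing problem and noticed that my algorithm runs slower when parallelized than when single-threaded.

My code does not share memory, and I'm fairly sure my algorithm (see the code), which is just nested loops, is CPU-bound.

Yet no matter what I do, the parallel code runs 10-20% slower on all of my computers.

I also ran it on a 20-CPU virtual machine, and single-threaded beat multithreaded every time (it was actually even slower there than on my own computer).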

from multiprocessing.dummy import Pool as ThreadPool
from multi import chunks
from random import random
import logging
import time

## Produce two sets of items we can iterate over
S = []
for x in range(100000):
  S.append({'value': x*random()})
H = []
for x in range(255):
  H.append({'value': x*random()})

# the function for each thread
# just nested iteration
def doStuff(HH):
  R = []
  for k in HH['S']:
    for h in HH['H']:
      R.append(k['value'] * h['value'])
  return R

# we will split the work
# between the worker threads and give each
# 5 items of H to iterate over the big list with
HChunks = chunks(H, 5)
XChunks = []

# turn them into dictionaries, so I can pass in both
# the S and H lists
# Note: I do this because I'm not sure whether using the global
# S would spend too much time on cache synchronization or not;
# the idea is that I don't want the threads to share anything.
for x in HChunks:
  XChunks.append({'H': x, 'S': S})

print("Process")
t0 = time.time()
pool = ThreadPool(4)
R = pool.map(doStuff, XChunks)
pool.close()
pool.join()

t1 = time.time()

# measured time for 4 threads is slower
# than when I have this code just do
# doStuff(..) in a non-parallel way
# Why!?

total = t1-t0
print("Took", total, "secs")

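There are a number of related questions already open, but many of them concern incorrectly structured code, e.g. each worker being IO-bound, and so on.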

Mis*_*agi 6

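You are using multithreading, not multiprocessing. While many languages allow threads to run in parallel, Python (CPython, specifically) does not. A thread is just a separate state of control, i.e. it holds its own stack, current function, etc. The Python interpreter just switches between executing each stack every now and then.

Effectively, all threads run on a single core's worth of CPU: only one of them executes Python code at any moment. They will only speed up your program when you are not CPU-bound.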

multiprocessing.dummy replicates the API of multiprocessing, but is no more than a wrapper around the threading module.
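One way to see this for yourself is to ask each pool worker which process it runs in. The following is a minimal sketch; `describe_worker` and `inspect_dummy_pool` are hypothetical names made up for this illustration, not part of the question's code.

```python
import os
import threading
from multiprocessing.dummy import Pool as ThreadPool


def describe_worker(_):
    # Report the process ID and whether this call runs outside the main thread.
    in_worker_thread = threading.current_thread() is not threading.main_thread()
    return os.getpid(), in_worker_thread


def inspect_dummy_pool(n_workers=4, n_tasks=8):
    # multiprocessing.dummy.Pool hands back a thread-backed pool.
    with ThreadPool(n_workers) as pool:
        return pool.map(describe_worker, range(n_tasks))


if __name__ == "__main__":
    reports = inspect_dummy_pool()
    print("parent pid:", os.getpid())
    print("worker pids:", sorted({pid for pid, _ in reports}))
```

Every worker should report the parent's PID, because the "processes" of this pool are threads inside the same interpreter.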

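If you are CPU-bound, multithreading is usually slower than single threading. This is because the work and the processing resources stay the same, but you add the overhead of managing the threads, e.g. switching between them.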
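A rough way to check this on your own machine is to time the same CPU-bound work serially and through a thread pool. This is only a sketch: `cpu_bound`, `time_serial`, and `time_threaded` are hypothetical helpers, the task sizes are arbitrary, and the threaded timing includes the cost of creating the pool. The actual numbers depend on your interpreter and hardware.

```python
import time
from multiprocessing.dummy import Pool as ThreadPool


def cpu_bound(n):
    # Pure-Python arithmetic loop: no IO, so threads have nothing to wait on.
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_serial(tasks):
    # Run every task one after another in the calling thread.
    start = time.perf_counter()
    results = [cpu_bound(n) for n in tasks]
    return results, time.perf_counter() - start


def time_threaded(tasks, n_threads=4):
    # Run the same tasks through a thread-backed pool.
    start = time.perf_counter()
    with ThreadPool(n_threads) as pool:
        results = pool.map(cpu_bound, tasks)
    return results, time.perf_counter() - start


if __name__ == "__main__":
    tasks = [200_000] * 8
    serial_results, serial_secs = time_serial(tasks)
    threaded_results, threaded_secs = time_threaded(tasks)
    print(f"serial:   {serial_secs:.3f} s")
    print(f"threaded: {threaded_secs:.3f} s")
    print("same results:", serial_results == threaded_results)
```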

How to fix this: instead of using from multiprocessing.dummy import Pool as ThreadPool, use from multiprocessing import Pool as ThreadPool.
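Applied to the question's code, the change might look like the sketch below. Some assumptions: `chunks` here is a hypothetical stand-in for the asker's own `multi.chunks` helper (which is not shown in the question), the input sizes are reduced to keep the run short, and the `if __name__ == "__main__":` guard is there because worker processes may re-import the main module on some platforms.

```python
from multiprocessing import Pool
from random import random


def chunks(items, size):
    # Hypothetical stand-in for the question's `multi.chunks` helper.
    return [items[i:i + size] for i in range(0, len(items), size)]


def do_stuff(job):
    # Same nested iteration as doStuff in the question.
    return [k["value"] * h["value"] for k in job["S"] for h in job["H"]]


def build_jobs(S, H, chunk_size=5):
    # One job per chunk of H; each job carries the full S list.
    return [{"H": part, "S": S} for part in chunks(H, chunk_size)]


def main():
    S = [{"value": x * random()} for x in range(1000)]
    H = [{"value": x * random()} for x in range(20)]
    jobs = build_jobs(S, H)
    # Real worker processes, each with its own interpreter.
    with Pool(4) as pool:
        results = pool.map(do_stuff, jobs)
    print("chunks processed:", len(results))


if __name__ == "__main__":
    main()
```

Keep in mind that processes do not share memory: every job (including its copy of S) and every result list is pickled and sent between processes, so for cheap per-item work that transfer cost can eat into the gain from using several cores.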


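You may want to read up on the GIL, the Global Interpreter Lock. It is what prevents threads from running in parallel (and it has implications for single-threaded performance). Python interpreters other than CPython may not have a GIL and may be able to run multithreaded code on several cores.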

  • My whole life has been a lie. (6 upvotes)
  • @40Plot No, Python's approach is just odd, and dated. (2 upvotes)