use*_*966 17 python numpy multiprocessing
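I am trying to run a simple test with multiprocessing. The test works fine until I import numpy (even though it is not used anywhere in the program). Here is the code: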
from multiprocessing import Pool
import time
import numpy as np  # this is the problematic line


def CostlyFunc(N):
    """Burn CPU time in a pure-Python double loop and report how long it took."""
    tstart = time.time()
    x = 0
    for i in xrange(N):
        for j in xrange(N):
            if i % 2:
                x += 2
            else:
                x -= 2
    print "CostlyFunc : elapsed time %f s" % (time.time() - tstart)
    return x


# serial application
ResultList0 = []
StartTime = time.time()
for i in xrange(3):
    ResultList0.append(CostlyFunc(5000))
print "Elapsed time (serial) : ", time.time() - StartTime

# multiprocessing application
StartTime = time.time()
pool = Pool()
asyncResult = pool.map_async(CostlyFunc, [5000, 5000, 5000])
ResultList1 = asyncResult.get()
print "Elapsed time (multiporcessing) : ", time.time() - StartTime
If I do not import numpy, the output is:
CostlyFunc : elapsed time 2.866265 s
CostlyFunc : elapsed time 2.793213 s
CostlyFunc : elapsed time 2.794936 s
Elapsed time (serial) : 8.45455098152
CostlyFunc : elapsed time 2.889815 s
CostlyFunc : elapsed time 2.891556 s
CostlyFunc : elapsed time 2.898898 s
Elapsed time (multiporcessing) : 2.91595196724
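The total elapsed time is about the same as the time a single process needs, which means the computation was parallelized. If I import numpy, the output becomes: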
CostlyFunc : elapsed time 2.877116 s
CostlyFunc : elapsed time 2.866778 s
CostlyFunc : elapsed time 2.860894 s
Elapsed time (serial) : 8.60492110252
CostlyFunc : elapsed time 8.450145 s
CostlyFunc : elapsed time 8.473006 s
CostlyFunc : elapsed time 8.506402 s
Elapsed time (multiporcessing) : 8.55398178101
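The total elapsed time is now the same for the serial and the multiprocessing runs, because only one core is being used. The problem clearly comes from numpy. Is there an incompatibility between my version of multiprocessing and NumPy?
I am currently using Python 2.7, NumPy 1.6.2 and multiprocessing 0.70a1 on Linux.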
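One way to narrow this down is to check whether importing numpy changes the set of cores the process is allowed to run on. The thread does not confirm that this is the cause; it is only a diagnostic sketch. It assumes Python 3.3+ on Linux, where `os.sched_getaffinity` exists (the script above is Python 2), and `allowed_cpus` is a hypothetical helper name.

```python
import os


def allowed_cpus():
    """Return the set of CPU indices this process may run on.

    Returns None on platforms without os.sched_getaffinity
    (the call is only available on some Unix systems, e.g. Linux).
    """
    if hasattr(os, "sched_getaffinity"):
        return os.sched_getaffinity(0)  # 0 means "the current process"
    return None


if __name__ == "__main__":
    before = allowed_cpus()
    try:
        import numpy  # noqa: F401  (imported only for its side effects)
    except ImportError:
        print("numpy is not installed; nothing to compare")
    after = allowed_cpus()
    print("allowed CPUs before import:", before)
    print("allowed CPUs after import: ", after)
```

If the second set is smaller than the first, the worker processes created by `Pool()` would inherit that restriction, which would match the single-core behaviour described above.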
小智 4
(First post, so apologies if this is poorly worded or badly formatted.)
You can stop NumPy from using multithreading by setting MKL_NUM_THREADS to 1.
On Debian I used:
export MKL_NUM_THREADS=1
Source: the related Stack Overflow post "Python: How do you stop numpy from multithreading?"
Results:
user@pc:~/tmp$ python multi.py
CostlyFunc : elapsed time 3.847009 s
CostlyFunc : elapsed time 3.253226 s
CostlyFunc : elapsed time 3.415734 s
Elapsed time (serial) : 10.5163660049
CostlyFunc : elapsed time 4.218424 s
CostlyFunc : elapsed time 5.252429 s
CostlyFunc : elapsed time 4.862513 s
Elapsed time (multiporcessing) : 9.11713695526
user@pc:~/tmp$ export MKL_NUM_THREADS=1
user@pc:~/tmp$ python multi.py
CostlyFunc : elapsed time 3.014677 s
CostlyFunc : elapsed time 3.102548 s
CostlyFunc : elapsed time 3.060915 s
Elapsed time (serial) : 9.17840886116
CostlyFunc : elapsed time 3.720322 s
CostlyFunc : elapsed time 3.950583 s
CostlyFunc : elapsed time 3.656165 s
Elapsed time (multiporcessing) : 7.399310112
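I'm not sure this helps, since I imagine you ultimately want NumPy itself to run in parallel; you could try tuning NumPy's thread count to suit your machine.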
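The same limit can also be set from inside the script instead of the shell, and the thread count can be derived from the machine rather than fixed at 1. The following is only a sketch in Python 3: `blas_threads_per_worker` and `set_blas_thread_limit` are hypothetical helper names, and setting `OMP_NUM_THREADS` and `OPENBLAS_NUM_THREADS` alongside `MKL_NUM_THREADS` is an assumption that covers NumPy builds not linked against MKL. These variables are typically read when the library is loaded, so the call should come before the first `import numpy`.

```python
import multiprocessing
import os


def blas_threads_per_worker(n_workers, n_cores=None):
    """Split the available cores evenly across worker processes.

    Always returns at least 1, so oversubscribed setups still get one thread.
    """
    if n_cores is None:
        n_cores = multiprocessing.cpu_count()
    return max(1, n_cores // n_workers)


def set_blas_thread_limit(n_threads):
    """Set the thread-count environment variables for common BLAS backends."""
    for var in ("MKL_NUM_THREADS", "OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS"):
        os.environ[var] = str(n_threads)


# Three workers, as in the question's pool.map_async call.
set_blas_thread_limit(blas_threads_per_worker(n_workers=3))

# Import numpy only after the limit is in place, for example:
# import numpy as np
```

With one thread per worker this mirrors `export MKL_NUM_THREADS=1`; on a machine with more cores than workers, each worker gets a proportional share instead.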