jaa*_*aap 0 python numpy pool multiprocessing python-3.x
我有这个非常简单的python代码,我希望通过并行化来加速它.然而,无论我做什么,multiprocessing.Pool.map都不会在标准地图上获得任何东西.
我已经读过其他线程,其中人们使用这个非常小的函数,这些函数不能很好地并行化并导致过多的开销,但我认为这不应该是这种情况.
难道我做错了什么?
这是一个例子
#!/usr/bin/python
import numpy, time
def AddNoise(sample):
#time.sleep(0.001)
return sample + numpy.random.randint(0,9,sample.shape)
#return sample + numpy.ones(sample.shape)
n=100
m=10000
start = time.time()
A = list([ numpy.random.randint(0,9,(n,n)) for i in range(m) ])
print("creating %d numpy arrays of %d x %d took %.2f seconds"%(m,n,n,time.time()-start))
for i in range(3):
start = time.time()
A = list(map(AddNoise, A))
print("adding numpy arrays took %.2f seconds"%(time.time()-start))
for i in range(3):
import multiprocessing
start = time.time()
with multiprocessing.Pool(processes=2) as pool:
A = list(pool.map(AddNoise, A, chunksize=100))
print("adding numpy arrays with multiprocessing Pool took %.2f seconds"%(time.time()-start))
for i in range(3):
import concurrent.futures
start = time.time()
with concurrent.futures.ProcessPoolExecutor(max_workers=2) as executor:
A = list(executor.map(AddNoise, A))
print("adding numpy arrays with concurrent.futures.ProcessPoolExecutor took %.2f seconds"%(time.time()-start))
Run Code Online (Sandbox Code Playgroud)
这导致我的4核/ 8线笔记本电脑上的以下输出,否则是空闲的
$ python test-pool.py
creating 10000 numpy arrays of 100 x 100 took 1.54 seconds
adding numpy arrays took 1.65 seconds
adding numpy arrays took 1.51 seconds
adding numpy arrays took 1.51 seconds
adding numpy arrays with multiprocessing Pool took 1.99 seconds
adding numpy arrays with multiprocessing Pool took 1.98 seconds
adding numpy arrays with multiprocessing Pool took 1.94 seconds
adding numpy arrays with concurrent.futures.ProcessPoolExecutor took 3.32 seconds
adding numpy arrays with concurrent.futures.ProcessPoolExecutor took 3.17 seconds
adding numpy arrays with concurrent.futures.ProcessPoolExecutor took 3.25 seconds
Run Code Online (Sandbox Code Playgroud)
问题在于结果转移.考虑到使用多处理,您在子进程内创建的数组需要转移回主进程..这是一个开销.
我检查了这个以这种方式修改AddNoise函数,这保留了计算时间,但丢弃了传输时间:
def AddNoise(sample):
sample + numpy.random.randint(0,9,sample.shape)
return None
Run Code Online (Sandbox Code Playgroud)
| 归档时间: |
|
| 查看次数: |
512 次 |
| 最近记录: |