Dan*_*anc 2 python parallel-processing
I have a data-analysis script that takes one argument specifying which segment of the analysis to perform. I want to run 'n' instances of the script, where 'n' is the number of cores on the machine. The complication is that there are more analysis segments than cores, so I want at most 'n' processes running at once; when one finishes, the next should start. Has anyone done something like this using the subprocess module?
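A minimal sketch of the pattern described above, using subprocess.Popen directly: keep at most n child processes alive and start a new one whenever a slot frees up. The function name run_limited and the demo command list are illustrative, not from the original post.

```python
import os
import subprocess
import sys
import time

def run_limited(commands, n):
    """Run each command in `commands`, keeping at most n subprocesses alive."""
    pending = list(commands)
    running = []
    return_codes = []
    while pending or running:
        # top up to n concurrent processes
        while pending and len(running) < n:
            running.append(subprocess.Popen(pending.pop(0)))
        # reap finished processes, keep the rest
        still_running = []
        for proc in running:
            if proc.poll() is None:
                still_running.append(proc)
            else:
                return_codes.append(proc.returncode)
        running = still_running
        time.sleep(0.1)  # avoid a busy loop
    return return_codes

if __name__ == '__main__':
    # hypothetical example: 10 analysis segments, one script invocation each
    cmds = [[sys.executable, '-c', 'print(%d)' % i] for i in range(10)]
    codes = run_limited(cmds, os.cpu_count() or 2)
    print(codes)
```

Each command is a normal argv list, so the real analysis script would be invoked as e.g. `[sys.executable, 'analysis.py', str(segment)]`.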
Rah*_*tam 10
I think the multiprocessing module will help you achieve what you need. Take a look at this example technique.
import multiprocessing

def do_calculation(data):
    """
    @note: put your analysis code here
    """
    return data * 2

def start_process():
    print('Starting', multiprocessing.current_process().name)

if __name__ == '__main__':
    analysis_jobs = list(range(10))  # could be your analysis work
    print('analysis_jobs:', analysis_jobs)

    pool_size = multiprocessing.cpu_count() * 2
    pool = multiprocessing.Pool(processes=pool_size,
                                initializer=start_process,
                                maxtasksperchild=2)
    # maxtasksperchild tells the pool to restart a worker process after
    # it has finished a few tasks. This can be used to avoid having
    # long-running workers consume ever more system resources.
    pool_outputs = pool.map(do_calculation, analysis_jobs)
    # The result of the map() method is functionally equivalent to the
    # built-in map(), except that individual tasks run in parallel.
    # Since the pool is processing its inputs in parallel, close() and
    # join() can be used to synchronize the main process with the
    # task processes to ensure proper cleanup.
    pool.close()  # no more tasks
    pool.join()   # wrap up current tasks
    print('Pool:', pool_outputs)
You can find a good introduction to multiprocessing techniques here.
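As a related sketch: on Python 3 the standard-library concurrent.futures module wraps the same worker-pool idea in a context manager, which handles the close()/join() cleanup automatically. This is an alternative to the Pool code above, not part of the original answer; do_calculation is reused from it.

```python
import concurrent.futures
import os

def do_calculation(data):
    # placeholder for one analysis segment
    return data * 2

if __name__ == '__main__':
    analysis_jobs = list(range(10))
    n = os.cpu_count() or 2
    # ProcessPoolExecutor keeps at most n workers busy; as each task
    # finishes, the next queued one is started automatically.
    with concurrent.futures.ProcessPoolExecutor(max_workers=n) as executor:
        results = list(executor.map(do_calculation, analysis_jobs))
    print('results:', results)
```

The `with` block exits only after all submitted tasks have completed, giving the same synchronization as the explicit close()/join() pair.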
Views: 239