use*_*619 7 python multiprocessing
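I am using Python's multiprocessing Pool module to create a pool of processes and assign jobs to it.
I created 4 processes and assigned 2 jobs, then tried to display their process IDs, but the output shows only one process ID, "6952". Shouldn't it print 2 process IDs?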
from multiprocessing import Pool
from time import sleep

def f(x):
    import os
    print "process id = " , os.getpid()
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)              # start 4 worker processes
    result = pool.map_async(f, (11,))     # Start job 1
    result1 = pool.map_async(f, (10,))    # Start job 2
    print "result = ", result.get(timeout=1)
    print "result1 = ", result1.get(timeout=1)
Result:
result = process id = 6952
process id = 6952
[121]
result1 = [100]
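This is just down to timing. Windows needs to spawn 4 processes in the Pool, which then need to start up, initialize, and prepare to consume from the Queue. On Windows, this requires each child process to re-import the __main__ module, and the Queue instances used internally by the Pool to be unpickled in each child. This takes a non-trivial amount of time. Long enough, in fact, that both of your map_async() calls are executed before all the processes in the Pool are even up and running. You can see this if you add some tracing to the function run by each worker in the Pool: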
while maxtasks is None or (maxtasks and completed < maxtasks):
    try:
        print("getting {}".format(current_process()))
        task = get()  # This is getting the task from the parent process
        print("got {}".format(current_process()))
    # (excerpt; the rest of the worker loop is unchanged)
Output:
getting <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
got <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
process id = 5145
getting <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
got <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
process id = 5145
getting <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
result = [121]
result1 = [100]
getting <ForkServerProcess(ForkServerPoolWorker-2, started daemon)>
getting <ForkServerProcess(ForkServerPoolWorker-3, started daemon)>
getting <ForkServerProcess(ForkServerPoolWorker-4, started daemon)>
got <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
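As you can see, Worker-1 starts up and consumes both tasks before workers 2-4 ever try to consume from the Queue. If you add a sleep call in the main process after instantiating the Pool, but before calling map_async, you'll see different processes handle each request: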
getting <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
getting <ForkServerProcess(ForkServerPoolWorker-2, started daemon)>
getting <ForkServerProcess(ForkServerPoolWorker-3, started daemon)>
getting <ForkServerProcess(ForkServerPoolWorker-4, started daemon)>
# <sleeping here>
got <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
process id = 5183
got <ForkServerProcess(ForkServerPoolWorker-2, started daemon)>
process id = 5184
getting <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
getting <ForkServerProcess(ForkServerPoolWorker-2, started daemon)>
result = [121]
result1 = [100]
got <ForkServerProcess(ForkServerPoolWorker-3, started daemon)>
got <ForkServerProcess(ForkServerPoolWorker-4, started daemon)>
got <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
got <ForkServerProcess(ForkServerPoolWorker-2, started daemon)>
(Note that the extra "getting"/"got" statements you see are the sentinels being sent to each process to shut them down gracefully.)
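The sleep experiment can also be tried without patching the standard library. What follows is only a minimal sketch in Python 3 syntax: `f` is changed from the question's version to return the worker's PID instead of printing it, and `run_jobs` and `warmup_seconds` are hypothetical names introduced for illustration. Which PIDs you see depends on the platform, the start method, and timing, so neither outcome is guaranteed.

```python
import os
import time
from multiprocessing import Pool


def f(x):
    # Return the worker's PID alongside the result so the parent can report it.
    return os.getpid(), x * x


def run_jobs(warmup_seconds):
    """Submit two one-item jobs after an optional pause.

    Returns a list of (worker_pid, value) pairs, one per job.
    `run_jobs` and `warmup_seconds` are illustrative names, not part of any API.
    """
    with Pool(processes=4) as pool:
        # Pause between creating the Pool and submitting work, so that
        # all workers have a chance to finish starting up.
        time.sleep(warmup_seconds)
        first = pool.map_async(f, (11,))
        second = pool.map_async(f, (10,))
        return first.get(timeout=10) + second.get(timeout=10)


if __name__ == '__main__':
    for pause in (0, 1):
        print("pause =", pause, "->", run_jobs(pause))
```

With no pause, both jobs may well report the same PID; with the pause, they are more likely to report two different ones.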
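Using Python 3.x on Linux, I'm able to reproduce this behavior with the 'spawn' and 'forkserver' contexts, but not with 'fork'. Presumably that is because forking the child processes is much quicker than spawning them and re-importing __main__.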
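To compare start methods yourself, something along these lines may help. It is a rough sketch rather than a benchmark: `distinct_workers` is a hypothetical helper name, and the count it reports is timing-dependent and can vary from run to run. It relies on `multiprocessing.get_context()` and `multiprocessing.get_all_start_methods()`, both available in Python 3.4 and later.

```python
import multiprocessing as mp
import os


def f(x):
    # Report which worker process handled the item.
    return os.getpid(), x * x


def distinct_workers(method):
    """Run two one-item jobs under the given start method.

    Returns the set of worker PIDs that handled them.
    `distinct_workers` is an illustrative helper, not a library function.
    """
    ctx = mp.get_context(method)
    with ctx.Pool(processes=4) as pool:
        first = pool.map_async(f, (11,))
        second = pool.map_async(f, (10,))
        pairs = first.get(timeout=30) + second.get(timeout=30)
    return {pid for pid, _ in pairs}


if __name__ == '__main__':
    # Only the start methods supported on the current platform are listed.
    for method in mp.get_all_start_methods():
        count = len(distinct_workers(method))
        print(method, "->", count, "distinct worker PID(s)")
```

A count of 1 for a given method means a single worker picked up both jobs, which is the behavior described in the question.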