Python ValueError: Pool not running in async multiprocessing

Tha*_*yen 3 python pool multiprocessing

I have a simple piece of code:

import time
from multiprocessing import Pool

import numpy as np
import umap

path = [filepath1, filepath2, filepath3]

def umap_embedding(filepath):
    file = np.genfromtxt(filepath,delimiter=' ')
    if len(file) > 20000:
        file = file[np.random.choice(file.shape[0], 20000, replace=False), :]
    neighbors = len(file)//200

    if neighbors >= 2:
        neighbors = neighbors
    else:
        neighbors = 2

    embedder = umap.UMAP(n_neighbors=neighbors,
                         min_dist=0.1,
                         metric='correlation', n_components=2)
    embedder.fit(file)
    embedded = embedder.transform(file)
    name = 'file'
    np.savetxt(name,embedded,delimiter=",")

if __name__ == '__main__':
    p = Pool(processes = 20)
    start = time.time()
    for filepath in path:
        p.apply_async(umap_embedding, [filepath])
        p.close()
        p.join()

    print("Complete")
    end = time.time()
    print('total time (s)= ' + str(end-start))

When I run it, the console returns an error:

Traceback (most recent call last):
  File "/home/cngc3/CBC/parallel.py", line 77, in <module>
    p.apply_async(umap_embedding, [filepath])
  File "/home/cngc3/anaconda3/envs/CBC/lib/python3.6/multiprocessing/pool.py", line 355, in apply_async
    raise ValueError("Pool not running")
ValueError: Pool not running

I tried to find a solution to this problem on Stack Overflow and Google, but found no related questions. Thanks for your help.

Mic*_*her 10

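p.close() and p.join() must be placed after the for loop. Otherwise the pool is closed during the first iteration of the loop and no longer accepts new jobs in the second iteration, which is what raises "Pool not running".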
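As a rough sketch of that ordering: submit every job first, then close and join once. Here process_file is a hypothetical stand-in for umap_embedding, and run_all and its processes parameter are also names made up for this illustration, not part of the original code.

```python
import time
from multiprocessing import Pool


def process_file(filepath):
    # Hypothetical stand-in for umap_embedding(): any module-level,
    # picklable function can be submitted to the pool.
    return len(filepath)


def run_all(paths, processes=4):
    pool = Pool(processes=processes)
    # Submit all jobs first; apply_async returns immediately.
    pending = [pool.apply_async(process_file, [p]) for p in paths]
    # Only after the loop: stop accepting work, then wait for the workers.
    pool.close()
    pool.join()
    # get() returns each result and re-raises any exception from a worker.
    return [job.get() for job in pending]


if __name__ == '__main__':
    start = time.time()
    results = run_all(['filepath1', 'filepath2', 'filepath3'])
    print("Complete", results)
    print('total time (s)= ' + str(time.time() - start))
```

Keeping the AsyncResult objects and calling get() on them is optional, but without it an exception raised inside a worker would pass silently.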