Parallel computing with joblib in Flask

Mig*_*uel 5 python parallel-processing flask joblib

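I have a Python function that needs to be called repeatedly with different parameter values. I want to do this in parallel on multiple CPUs, and I have done so successfully using the joblib module. I now want to make my code available as a web application using Flask, running on multiple CPUs on an AWS EC2 instance. Here is a toy example of what I tried: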

from flask import Flask
from joblib import Parallel, delayed
from time import sleep

def myfunc(x):
    sleep(5)
    return x

application = Flask(__name__)

@application.route('/', methods = ['GET'])
def getresult():
    out = Parallel(n_jobs=-1, verbose=10)(delayed(myfunc)(i) for i in range(5))
    return str(sum(out))

if __name__ == "__main__":
    application.debug = True
    application.run()

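The problem is that this code does not run in parallel on multiple CPUs. I get the following warning and output (the elapsed times confirm that it is not running in parallel):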

    /Library/anaconda/lib/python3.6/site-packages/joblib/parallel.py:547:
    UserWarning: Multiprocessing-backed parallel loops cannot be nested below 
    threads, setting n_jobs=1
      **self._backend_args)
    [Parallel(n_jobs=-1)]: Done   1 out of   1 | elapsed:    5.0s remaining:    0.0s
    [Parallel(n_jobs=-1)]: Done   2 out of   2 | elapsed:   10.0s remaining:    0.0s
    [Parallel(n_jobs=-1)]: Done   3 out of   3 | elapsed:   15.0s remaining:    0.0s
    [Parallel(n_jobs=-1)]: Done   4 out of   4 | elapsed:   20.0s remaining:    0.0s
    [Parallel(n_jobs=-1)]: Done   5 out of   5 | elapsed:   25.0s remaining:    0.0s
    [Parallel(n_jobs=-1)]: Done   5 out of   5 | elapsed:   25.0s finished

Any suggestions?

Seb*_*ner 1

Look at the UserWarning you get:

UserWarning: Multiprocessing-backed parallel loops cannot be nested below 
threads, setting n_jobs=1

Maybe this helps:

Multiprocessing-backed parallel loops cannot be nested below threads, setting n_jobs=1

Flask probably spins up its own threads in the background, so your getresult() is probably not running in the MainThread.
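To make the "not running in the MainThread" idea concrete, here is a minimal sketch using only the standard library. It assumes nothing about how Flask or joblib work internally; the helper names in_main_thread and run_in_worker are hypothetical, and the worker thread is only a stand-in for whatever thread a server might use to handle a request. The exact check joblib performs may differ.

```python
import threading


def in_main_thread():
    """Return True when the caller runs in the interpreter's main thread."""
    return threading.current_thread() is threading.main_thread()


def run_in_worker(func):
    """Run func in a separate thread and return its result.

    The extra thread is a stand-in (an assumption for illustration) for a
    thread that a web server might use to serve a request.
    """
    result = []
    worker = threading.Thread(
        target=lambda: result.append(func()),
        name="request-worker",  # hypothetical name, for illustration only
    )
    worker.start()
    worker.join()
    return result[0]


# Code called from the worker thread does not see itself as the main thread.
called_from_worker = run_in_worker(in_main_thread)
print("in main thread (called from worker):", called_from_worker)
```

Logging threading.current_thread().name at the top of getresult() would be one way to see which thread actually serves the request in your setup.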