> 90% of the time is spent in the 'acquire' method of 'thread.lock' objects

Sch*_*tor 10 profiling multiprocessing cprofile python-2.7

To identify the step that takes most of the computation time, I ran cProfile and got the following result:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.014    0.014  216.025  216.025 func_poolasync.py:2(<module>)
    11241  196.589    0.017  196.589    0.017 {method 'acquire' of 'thread.lock' objects}
      982    0.010    0.000  196.532    0.200 threading.py:309(wait)
     1000    0.002    0.000  196.498    0.196 pool.py:565(get)
     1000    0.005    0.000  196.496    0.196 pool.py:557(wait)
515856/3987    0.350    0.000   13.434    0.003 artist.py:230(stale)

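Most of the time is clearly spent in {method 'acquire' of 'thread.lock' objects}. I am not using threads; instead I am using pool.apply_async across several processors, so I am confused as to why thread.lock shows up here at all.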
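
A likely reading of this profile: cProfile only instruments the process and thread it runs in, so the work done inside the pool's worker processes is invisible to it. What it records instead is the parent blocking in `r.get()`, which the rows above suggest goes through `pool.py:557(wait)` and `threading.py:309(wait)` down to a lock `acquire`. The sketch below reproduces that effect without any pool, by profiling a plain `threading.Event().wait()`. It assumes Python 3, where the lock type is reported as `_thread.lock` rather than `thread.lock`; `blocked_wait` and `profile_wait` are illustrative names.

```python
import cProfile
import pstats
import threading


def blocked_wait(seconds):
    """Block the calling thread on a threading primitive, much like a pending get()."""
    event = threading.Event()  # never set, so wait() simply times out
    event.wait(seconds)


def profile_wait(seconds=0.2):
    """Profile blocked_wait and return {(filename, lineno, funcname): tottime}."""
    profiler = cProfile.Profile()
    profiler.runcall(blocked_wait, seconds)
    stats = pstats.Stats(profiler)
    # stats.stats maps (filename, lineno, funcname) -> (cc, nc, tottime, cumtime, callers)
    return {key: value[2] for key, value in stats.stats.items()}


if __name__ == "__main__":
    timings = profile_wait()
    top_key = max(timings, key=timings.get)
    # The entry with the largest tottime is the lock acquire, not blocked_wait itself.
    print(top_key[2], round(timings[top_key], 3))
```

In other words, a large `acquire` total in the parent's profile is mostly idle waiting time, not computation.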

I would appreciate an explanation of why this is the bottleneck, and how this time can be brought down.
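
Since the parent's profile mostly shows waiting, one way to see where the real time goes is to profile `func` itself for a single input, in one process and without the pool. A minimal sketch of that idea follows; `func` here is a hypothetical stand-in (the real function is not shown in the post) and `profile_single_call` is an illustrative helper name.

```python
import cProfile
import io
import pstats


def func(nomb, a, b, c):
    """Hypothetical stand-in for the real worker function, which the post does not show."""
    return sum((nomb + a) * i % (b + c) for i in range(20000))


def profile_single_call(target, *args):
    """Run target(*args) under cProfile; return (result, report text sorted by tottime)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(target, *args)
    buffer = io.StringIO()
    pstats.Stats(profiler, stream=buffer).sort_stats("tottime").print_stats(10)
    return result, buffer.getvalue()


if __name__ == "__main__":
    value, report = profile_single_call(func, 3, 1, 2, 5)
    print(report)
```

The report lists the hot spots inside `func`, which is the information the pool-level profile cannot provide.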

The code looks like this:

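import os
import pickle
import time
from multiprocessing import Pool

path = '/usr/home/work'
filename = 'filename'

# func is the worker function (its definition is not shown here)
with open(os.path.join(path, filename, 'result.pickle'), 'rb') as f:
    pdata = pickle.load(f)

if __name__ == '__main__':
    pool = Pool()
    data = list(range(1000))
    print('START')
    start_time = int(round(time.time()))
    # Submit one task per element; each call returns an AsyncResult immediately
    result_objects = [pool.apply_async(func, args=(nomb, pdata[0], pdata[1], pdata[2])) for nomb in data]

    # Block until every task has finished and collect the return values
    results = [r.get() for r in result_objects]

    pool.close()
    pool.join()
    print('END', int(round(time.time())) - start_time)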

Edit:

By switching from pool.apply_async to pool.map, I was able to cut the execution time by roughly a factor of 3.

Output:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.113    0.113   70.824   70.824 func.py:2(<module>)
     4329   28.048    0.006   28.048    0.006 {method 'acquire' of 'thread.lock' objects}
        4    0.000    0.000   28.045    7.011 threading.py:309(wait)
        1    0.000    0.000   28.044   28.044 pool.py:248(map)
        1    0.000    0.000   28.044   28.044 pool.py:565(get)
        1    0.000    0.000   28.044   28.044 pool.py:557(wait)

Modified code:

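from functools import partial

if __name__ == '__main__':
    pool = Pool()
    data = list(range(1000))

    print('START')
    start_time = int(round(time.time()))
    # Bind the three shared arguments; map supplies each element of data as the remaining one
    funct = partial(func, pdata[0], pdata[1], pdata[2])
    results = pool.map(funct, data)
    print('END', int(round(time.time())) - start_time)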

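However, I have found that some iterations now produce nonsensical results. I am not sure why this happens, but it can be seen that {method 'acquire' of 'thread.lock' objects} is still the rate-determining step.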
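
One thing worth checking for the nonsensical results, offered as a possibility rather than a diagnosis: the two versions do not call `func` with the same argument order. `apply_async(func, args=(nomb, pdata[0], pdata[1], pdata[2]))` passes `nomb` first, whereas `partial(func, pdata[0], pdata[1], pdata[2])` binds the three `pdata` items as the leading positional arguments, so `pool.map` supplies `nomb` last. The sketch below makes the difference visible with a hypothetical `func` that just returns its arguments, and shows one way to keep `nomb` first by routing the call through a module-level adapter (`func_nomb_last` is an illustrative name).

```python
from functools import partial


def func(nomb, a, b, c):
    """Hypothetical stand-in that returns its arguments so the order is visible."""
    return (nomb, a, b, c)


def func_nomb_last(a, b, c, nomb):
    """Adapter taking nomb last, so partial() can bind a, b, c and map() can supply nomb."""
    return func(nomb, a, b, c)


pdata = ("A", "B", "C")  # placeholder values

# As in the modified code: pdata items fill the first three slots, nomb lands in the last.
shifted = partial(func, pdata[0], pdata[1], pdata[2])

# Same binding, but routed through the adapter so func still sees nomb first.
aligned = partial(func_nomb_last, pdata[0], pdata[1], pdata[2])

if __name__ == "__main__":
    print(shifted(7))  # ('A', 'B', 'C', 7)
    print(aligned(7))  # (7, 'A', 'B', 'C')
```

A `partial` of a module-level function can be pickled, which `pool.map` requires; a lambda doing the same reordering cannot.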