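This is my Python 3.6 code for simulating a 1D reflected random walk. It uses the joblib module to generate 400 realizations concurrently across K workers on a Linux cluster machine.
However, I notice that the runtime for K=3 is worse than for K=1, and the runtime for K=5 is even worse!
Can anyone see a way to improve my use of joblib?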
from math import sqrt
import numpy as np
import joblib as jl
import os
K = int(os.environ['SLURM_CPUS_PER_TASK'])
def f(j):
    N = 10**6
    p = 1/3
    np.random.seed(None)
    X = 2*np.random.binomial(1,p,N)-1   # X = 1 with probability p
    s = 0                               # X =-1 with probability 1-p
    m = 0
    for t in range(0,N):
        s = max(0,s+X[t])
        m = max(m,s)
    return m
pool = jl.Parallel(n_jobs=K)
W = …
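Separately from how the work is split across workers, the per-realization cost is dominated by the Python-level loop in f. The recursion s = max(0, s + X[t]) has a closed form: the reflected position at step t equals the free partial sum S_t minus min(0, S_1, ..., S_t). A minimal sketch of that idea follows. The names max_from_steps, max_from_steps_loop and reflected_walk_max are hypothetical, and the use of np.random.default_rng (NumPy 1.17 or later) is an assumption, not something taken from the code above. It has not been timed on the cluster in question, so any speedup should be measured rather than assumed.

```python
import numpy as np

def max_from_steps(steps):
    """Maximum of the walk reflected at 0, computed with array operations."""
    steps = np.asarray(steps)
    if steps.size == 0:
        return 0                                   # matches m = 0 when no steps are taken
    free = np.cumsum(steps)                        # unreflected partial sums S_1..S_n
    floor = np.minimum(np.minimum.accumulate(free), 0)  # min(0, S_1, ..., S_t) for each t
    return int(np.max(free - floor))               # largest reflected position reached

def max_from_steps_loop(steps):
    """Reference version: the same recursion as the loop in f."""
    s = m = 0
    for x in steps:
        s = max(0, s + x)
        m = max(m, s)
    return int(m)

def reflected_walk_max(n, p, rng):
    """Draw n steps (+1 with probability p, -1 otherwise) and return the maximum."""
    steps = 2 * rng.binomial(1, p, n) - 1
    return max_from_steps(steps)

if __name__ == "__main__":
    rng = np.random.default_rng()                  # hypothetical choice of generator
    print(reflected_walk_max(10**6, 1/3, rng))
```

If this form is used inside f, each task does the same amount of simulation in far fewer interpreter steps, so it is worth re-measuring the K=1, K=3 and K=5 runtimes afterwards, since the balance between compute time and process overhead will have changed.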