Sta*_*Bak 12 python numpy scipy python-multiprocessing
我正在使用python scipy.integrate来模拟29维线性微分方程组.由于我需要解决几个问题实例,我想我可以通过并行计算来加速它multiprocessing.Pool.由于线程之间没有共享数据或同步(问题是令人尴尬的并行),我认为这显然应该有效.然而,在我编写完代码之后,我得到了非常奇怪的性能测量:
什么是令人震惊的是,我认为应该是最快的设置,实际上是最慢的,而变异是两个数量级.这是一个确定性的计算; 计算机不应该以这种方式工作.什么可能导致这个?
我在另一台计算机上尝试了相同的代码,但我没有看到这种效果.
两台机器都使用Ubuntu 64位,Python 2.7.6,scipy版本0.18.0和numpy版本1.8.2.我没有看到Intel(R)Core(TM)i5-5300U CPU @ 2.30GHz处理器的可变性.我确实看到了英特尔(R)Core(TM)i7-2670QM CPU @ 2.20GHz的问题.
一个想法是处理器之间可能存在共享缓存,并且通过并行运行它不能适应缓存中的jacobian矩阵的两个实例,因此它们不断地相互竞争以使缓存相互减慢它们是连续运行或没有jacobian.但它不是一个百万变量系统.jacobian是一个29x29矩阵,占用6728个字节.处理器上的1级缓存为4 x 32 KB,大得多.处理器之间是否还有其他共享资源可能是罪魁祸首?我们怎么测试呢?
我注意到的另一件事是每个python进程在运行时似乎占用了几百%的CPU.这似乎意味着代码已经在某些时候并行化了(可能在低级库中).这可能意味着进一步的并行化无济于事,但我不希望这种急剧放缓.
尝试在更多机器上查看(1)其他人是否可以体验到减速并且(2)发生减速的系统的共同特征是什么是好的.该代码使用大小为2的多处理池进行两次并行计算的10次试验,为10次试验中的每次试验打印出每次scipy.ode.integrate调用的时间.
'odeint with multiprocessing variable execution time demonsrtation'
from numpy import dot as npdot
from numpy import add as npadd
from numpy import matrix as npmatrix
from scipy.integrate import ode
from multiprocessing import Pool
import time
def main():
"main function"
pool = Pool(2) # try Pool(1)
params = [0] * 2
for trial in xrange(10):
res = pool.map(run_one, params)
print "{}. times: {}ms, {}ms".format(trial, int(1000 * res[0]), int(1000 * res[1]))
def run_one(_):
"perform one simulation"
final_time = 2.0
init_state = [0.1 if d < 7 else 0.0 for d in xrange(29)]
(a_matrix, b_vector) = get_dynamics()
derivative = lambda dummy_t, state: npadd(npdot(a_matrix, state), b_vector)
jacobian = lambda dummy_t, dummy_state: a_matrix
#jacobian = None # try without the jacobian
#print "jacobian bytes:", jacobian(0, 0).nbytes
solver = ode(derivative, jacobian)
solver.set_integrator('vode')
solver.set_initial_value(init_state, 0)
start = time.time()
solver.integrate(final_time)
dif = time.time() - start
return dif
def get_dynamics():
"return a tuple (A, b), which are the system dynamics x' = Ax + b"
return \
(
npmatrix([
[0, 0, 0, 0.99857378006, 0.053384274244, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ],
[0, 0, 1, -0.003182219341, 0.059524655342, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ],
[0, 0, -11.570495605469, -2.544637680054, -0.063602626324, 0.106780529022, -0.09491866827, 0.007107574493, -5.20817921341, -23.125876742495, -4.246931301528, -0.710743697134, -1.486697327603, -0.044548215175, 0.03436637817, 0.022990248611, 0.580153205353, 1.047552018229, 11.265023544535, 2.622275290571, 0.382949404795, 0.453076470454, 0.022651889536, 0.012533628369, 0.108399390974, -0.160139432044, -6.115359574845, -0.038972389136, 0, ],
[0, 0, 0.439356565475, -1.998182296753, 0, 0.016651883721, 0.018462046981, -0.001187470742, -10.778778281386, 0.343052863546, -0.034949331535, -3.466737362551, 0.013415853489, -0.006501746896, -0.007248032248, -0.004835912875, -0.152495086764, 2.03915052839, -0.169614300211, -0.279125393264, -0.003678218266, -0.001679708185, 0.050812027754, 0.043273505033, -0.062305315646, 0.979162836629, 0.040401368402, 0.010697028656, 0, ],
[0, 0, -2.040895462036, -0.458999156952, -0.73502779007, 0.019255757332, -0.00459562242, 0.002120360732, -1.06432932386, -3.659159530947, -0.493546966858, -0.059561101143, -1.953512259413, -0.010939065041, -0.000271004496, 0.050563886711, 1.58833954495, 0.219923768171, 1.821923233098, 2.69319056633, 0.068619628466, 0.086310028398, 0.002415425662, 0.000727041422, 0.640963888079, -0.023016712545, -1.069845542887, -0.596675149197, 0, ],
[-32.103607177734, 0, -0.503355026245, 2.297859191895, 0, -0.021215811372, -0.02116791904, 0.01581159234, 12.45916782984, -0.353636907076, 0.064136531117, 4.035326800046, -0.272152744884, 0.000999589868, 0.002529691904, 0.111632959213, 2.736421830861, -2.354540136198, 0.175216915979, 0.86308171287, 0.004401276193, 0.004373406589, -0.059795009475, -0.051005479746, 0.609531777761, -1.1157829788, -0.026305051933, -0.033738880627, 0, ],
[0.102161169052, 32.057830810547, -2.347217559814, -0.503611564636, 0.83494758606, 0.02122657001, -0.037879735231, 0.00035400386, -0.761479736492, -5.12933410588, -1.131382179292, -0.148788337148, 1.380741054924, -0.012931029503, 0.007645723855, 0.073796656681, 1.361745395486, 0.150700793731, 2.452437244444, -1.44883919298, 0.076516270282, 0.087122640348, 0.004623192159, 0.002635233443, -0.079401941141, -0.031023369979, -1.225533436977, 0.657926151362, 0, ],
[-1.910972595215, 1.713829040527, -0.004005432129, -0.057411193848, 0, 0.013989634812, -0.000906753354, -0.290513515472, -2.060635522957, -0.774845915178, -0.471751979387, -1.213891560083, 5.030515136324, 0.126407660877, 0.113188603433, -2.078420624662, -50.18523312358, 0.340665548784, 0.375863242926, -10.641168797333, -0.003634153255, -0.047962774317, 0.030509705209, 0.027584169642, -10.542357589006, -0.126840767097, -0.391839285172, 0.420788121692, 0, ],
[0.126296110212, -0.002898250629, -0.319316070797, 0.785201711657, 0.001772374259, 0.00000584372, 0.000005233812, -0.000097899495, -0.072611454126, 0.001666291957, 0.195701043078, 0.517339177294, 0.05236528267, -0.000003359731, -0.000003009077, 0.000056285381, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ],
[-0.018114066432, 0.077615035084, 0.710897211118, 2.454275059389, -0.012792968774, 0.000040510624, 0.000036282541, -0.000678672106, 0.010414324729, -0.044623231468, 0.564308412696, -1.507321670112, 0.066879720068, -0.000023290783, -0.00002085993, 0.000390189123, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ],
[-0.019957254425, 0.007108972111, 122.639137999354, 1.791704310155, 0.138329792976, 0.000000726169, 0.000000650379, -0.000012165459, -8.481152717711, -37.713895394132, -93.658221074435, -4.801972165378, -2.567389718833, 0.034138340146, -0.038880106034, 0.044603217363, 0.946016722396, 1.708172458034, 18.369114490772, 4.275967542224, 0.624449778826, 0.738801257357, 0.036936909247, 0.020437742859, 0.176759579388, -0.261128576436, -9.971904607075, -0.063549647738, 0, ],
[0.007852964982, 0.003925745426, 0.287856349997, 58.053471054491, 0.030698062827, -0.000006837601, -0.000006123962, 0.000114549925, -17.580742026275, 0.55713614874, 0.205946900184, -43.230778067404, 0.004227082975, 0.006053854501, 0.006646690253, -0.009138926083, -0.248663457912, 3.325105302428, -0.276578605231, -0.455150962257, -0.005997822569, -0.002738986905, 0.082855748293, 0.070563187482, -0.101597078067, 1.596654829885, 0.065879787896, 0.017442923517, 0, ],
[0.011497315687, -0.012583019909, 13.848373855148, 22.28881517216, 0.042287331657, 0.000197558695, 0.000176939544, -0.003309689199, -1.742140233901, -5.959510415282, -11.333020298294, -14.216479234895, -3.944800806497, 0.001304578929, -0.005139259078, 0.08647432259, 2.589998222025, 0.358614863147, 2.970887395829, 4.39160430183, 0.111893402319, 0.140739944934, 0.003938671797, 0.001185537435, 1.045176603318, -0.037531801533, -1.744525005833, -0.972957942438, 0, ],
[-16.939142002537, 0.618053512295, 107.92089190414, 204.524147386814, 0.204407545189, 0.004742101706, 0.004247169746, -0.079444150933, -2.048456967261, -0.931989524708, -66.540858220883, -116.470289129818, -0.561301215495, -0.022312225275, -0.019484747345, 0.243518778973, 4.462098610572, -3.839389874682, 0.285714413078, 1.40736916669, 0.007176864388, 0.007131419303, -0.097503691021, -0.083171197416, 0.993922379938, -1.819432085819, -0.042893874898, -0.055015718216, 0, ],
[-0.542809857455, 7.081822285872, -135.012404429101, 460.929268260027, 0.036498617908, 0.006937238413, 0.006213200589, -0.116219147061, -0.827454697348, 19.622217613195, 78.553728334274, -283.23862765888, 3.065444785639, -0.003847616297, -0.028984525722, 0.187507140282, 2.220506417769, 0.245737625222, 3.99902408961, -2.362524402134, 0.124769923797, 0.142065016461, 0.007538727793, 0.004297097528, -0.129475392736, -0.050587718062, -1.998394759416, 1.072835822585, 0, ],
[-1.286456393795, 0.142279456389, -1.265748910581, 65.74306027738, -1.320702989799, -0.061855995532, -0.055400100872, 1.036269854556, -4.531489334771, 0.368539277612, 0.002487097952, -42.326462719738, 8.96223401238, 0.255676968878, 0.215513465742, -4.275436802385, -81.833676543035, 0.555500345288, 0.612894852362, -17.351836610113, -0.005925968725, -0.078209662789, 0.049750119549, 0.044979645917, -17.190711833803, -0.206830688253, -0.638945907467, 0.686150823668, 0, ],
[0, 0, 0, 0, 0, -0.009702263896, -0.008689641059, 0.162541456323, 0, 0, 0, 0, 0, 0, 0, 0, -0.012, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ],
[-8.153162937544, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -0.005, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ],
[0, -3.261265175018, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -0.005, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ],
[0, 0, 0, 0.17441246156, -3.261265175018, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -0.01, 0, 0, 0, 0, 0, 0, 0, 0, 0, ],
[0, 0, -3.261265175018, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -8.5, -18, 0, 0, 0, 0, 0, 0, 0, ],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, ],
[0, 0, 0, -8.153162937544, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -8.5, -18, 0, 0, 0, 0, 0, ],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, ],
[0, 0, 0, 0, 0, 0, 0, 0, 0.699960862226, 0.262038222227, 0.159589891262, 0.41155156501, -1.701619176699, -0.0427567124, -0.038285155304, 0.703045934017, 16.975651534025, -0.115788018654, -0.127109026104, 3.599544290134, 0.001229743857, 0.016223661959, -0.01033400498, -0.00934235613, -6.433934989563, 0.042639567847, 0.132540852847, -0.142338323726, 0, ],
[0, 0, 0, 0, 0, 0, 0, 0, -37.001496211974, 0.783588795613, -0.183854784348, -11.869599790688, -0.106084318011, -0.026306590251, -0.027118088888, 0.036744952758, 0.76460150301, 7.002366574508, -0.390318898363, -0.642631203146, -0.005701671024, 0.003522251111, 0.173867535377, 0.147911422248, 0.056092715216, -6.641979472328, 0.039602243105, 0.026181724138, 0, ],
[0, 0, 0, 0, 0, 0, 0, 0, 1.991401999957, 13.760045912368, 2.53041689113, 0.082528789604, 0.728264862053, 0.023902766734, -0.022896554363, 0.015327568208, 0.370476566397, -0.412566245022, -6.70094564846, -1.327038338854, -0.227019235965, -0.267482033427, -0.008650986307, -0.003394359441, 0.098792645471, 0.197714179668, -6.369398456151, -0.011976840769, 0, ],
[0, 0, 0, 0, 0, 0, 0, 0, 1.965859332057, -3.743127938662, -1.962645156793, 0.018929412474, 11.145046656101, -0.03600197464, -0.001222148117, 0.602488409354, 11.639787952728, -0.407672972316, 1.507740702165, -12.799953897143, 0.005393102236, -0.014208764492, -0.000915158115, -0.000640326416, -0.03653528842, 0.012458973237, -0.083125038259, -5.472831842357, 0, ],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ],
])
,
npmatrix([1.0 if d == 28 else 0.0 for d in xrange(29)])
)
if __name__ == "__main__":
main()
Run Code Online (Sandbox Code Playgroud)
这是一个演示问题的输出示例(每次运行略有不同).注意执行时间的大变化(超过两个数量级!).如果我使用大小为1的池(或运行没有池的代码),或者如果我在调用中没有使用显式的jacobian,那么这一切都会消失integrate.
- 次:5847毫秒,5760毫秒
- 次:4177毫秒,3991毫秒
- 时间:229毫秒,36毫秒
- 时间:1317毫秒,1544毫秒
- 次:87ms,100ms
- 次:113ms,102ms
- 次:4747毫秒,5077毫秒
- 次:597毫秒,48毫秒
- 时间:9毫秒,49毫秒
- 时间:135毫秒,109毫秒
这是关于@Dietrich 评论中提出的数学背景的格式化评论。由于它没有解决编程问题,我打算稍后删除这个答案,直到赏金结束。
正如@Dietrich 指出的,你可以精确地求解你的 ODE,因为如果
x' = A*x,
Run Code Online (Sandbox Code Playgroud)
那么精确解是
x(t) = exp(A*t)*x0
Run Code Online (Sandbox Code Playgroud)
我已经说过,精确解总是优于数值近似,但这确实比数值积分更快。正如您在评论中指出的,您担心效率。因此,不要计算每个 的矩阵指数t:仅计算一次的特征系统A:
A*v_i = L_i*v_i
Run Code Online (Sandbox Code Playgroud)
然后
x(t) = sum_i c_i*v_i*exp(L_i*t),
Run Code Online (Sandbox Code Playgroud)
并且系数c_i可以从线性方程确定
x0 = sum_i c_i*v_i.
Run Code Online (Sandbox Code Playgroud)
现在,只要矩阵不是奇异的,非齐次项就不会发生太大变化:
x' = A*x + b
(x - A^(-1)*b)' = A*(x - A^(-1)*b)
Run Code Online (Sandbox Code Playgroud)
因此我们可以求解齐次方程y = x - A^(-1)*b并在最后一步中恢复x = y + A^(-1)*b。
当矩阵是规则的时,这一切都很好,但在您的特定情况下它是奇异的。但事实证明,这是由于你的最终维度造成的:
>>> np.linalg.det(A)
0.0
>>> np.linalg.det(A[:-1,:-1])
1920987.0461154305
Run Code Online (Sandbox Code Playgroud)
还要注意 的最后一行A全为零(这就是 的奇异性的原因A)。因此 的最后一个维度x是恒定的(或由于 而线性变化b)。
我建议消除这个变量,重写其余变量的方程,并准确地使用上述过程求解 ODE 的非奇异非齐次线性系统。它应该更快、更精确。
以下内容会有点推测性,另请参阅最后的警告。
如果是用户输入A和b,事情可能会变得更加棘手。在矩阵中找到零行/列很容易,但A即使没有任何行/列完全为零,也可能是奇异的。我不是该主题的专家,但我认为最好的选择是使用类似于主成分分析的方法:根据 的特征系统转换方程组A。我的以下想法仍然会假设A是可对角化的,但主要是因为我不熟悉奇异值分解。在现实情况下,我希望你的矩阵可以对角化,即使是奇异的。
所以我假设矩阵A可以分解为
A = V * D * V^(-1),
Run Code Online (Sandbox Code Playgroud)
其中D是包含 的特征值的对角矩阵A, 的列是对应于每个特征值V的特征向量。A在 numpy 中可以使用以下方法获得完全相同的分解
DD,V = np.linalg.eig(A)
D = np.asmatrix(np.diag(DD))
Run Code Online (Sandbox Code Playgroud)
我通常更喜欢使用ndarrays 而不是矩阵,但这种方式V*D*np.linalg.inv(V)实际上对应于三个矩阵的矩阵乘积,而不是调用np.dot两次。
现在,再次重写你的方程:
x' = A*x + b
x' = V*D*V^(-1)*x + b
V^(-1)*x' = D*V^(-1)*x + V^(-1)*b
Run Code Online (Sandbox Code Playgroud)
通过定义辅助变量
X = V^(-1)*x
B = V^(-1)*b
Run Code Online (Sandbox Code Playgroud)
我们获得
X' = D*X + B
Run Code Online (Sandbox Code Playgroud)
即通常的非齐次形式,但现在是一个对角矩阵,其中对角线中D包含 的特征值。A
由于A是奇异的,因此某些特征值为零。在 中查找零元素D(好吧,您已经可以使用DDfrom做到这一点eig()),您就会知道它们在时间演化过程中的行为很平常。其余变量表现良好,尽管此时我们看到 的方程X由于D对角线而解耦,因此您可以独立地进行分析积分。为此,您需要首先从初始条件转到x0,X0 = np.linalg.inv(V)*x0然后在求解方程后返回到x = V*X。
警告:正如我所说,我不是这个主题的专家。我可以很容易地想象,对角化中涉及的反转在实际应用中可能是一个数值问题。因此,我首先测试矩阵是否是奇异的,并且只有在是(或接近是)时才继续执行此过程。上面的内容可能存在很多错误,在这种情况下数值积分可能会更好(我真的无法判断)。